Early patterns of the TB.

Understanding pedigrees, inbreeding, dosage, etc.

Moderators: Roguelet, hpkingjr, WaveMaster, Lucy

Pan Zareta
Breeder's Cup Winner
Posts: 2074
Joined: Wed Dec 22, 2004 10:55 am
Location: west TX boonies

Postby Pan Zareta » Sun Nov 23, 2008 1:52 pm

diomed wrote:
xfactor fan wrote:Any idea on when the new Hill study is going to be published?

Can't wait to see that one.


Do you have info on what they are studying?
More TB families perhaps?
I can't wait either.


Why wait? :wink: See GenBank accession #s EU580148-EU580172. The working title (no publication to date) is "Thoroughbred horse mitochondrial DNA reveals that maternal pedigree sub-strain or sub-family designation is a better predictor of genetic history of descent than strain or family designation". The nomenclature in the accession definitions is fairly straightforward, as was that of the TB submissions for the 2002 study (AF481305-AF481323), meaning it's not too difficult to associate TB female family/families w/ accession #'s. Looks to me like one or more members of family 15 have been sampled and that their mtDNA is probably consistent w/ haplotype E as defined in the 2002 study. However, the GenBank submissions are short sequence (bp 15476 - 15818), not the longer sequence (15456 - 15837) that was used to analyze TB haplotypes for that study. Since one of haplotype E's segregating sites is 15827, which falls outside the short sequence, it's impossible to know for sure.
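The coverage problem can be checked mechanically. Here is a minimal sketch, assuming both ranges are inclusive; the function and constant names are made up for illustration, and only the coordinates come from the post.

```python
# Inclusive bp ranges quoted in the post.
SHORT_RANGE = (15476, 15818)  # short sequence submitted to GenBank
LONG_RANGE = (15456, 15837)   # longer sequence used for the 2002 haplotypes


def covers(bp_range, site):
    """Return True if `site` lies inside the inclusive range."""
    start, end = bp_range
    return start <= site <= end


site = 15827  # one of haplotype E's segregating sites
print(covers(LONG_RANGE, site))   # True
print(covers(SHORT_RANGE, site))  # False
```

A site that only the long range covers cannot confirm or rule out a haplotype from the short submissions alone.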

diomed
Grade III Winner
Posts: 1142
Joined: Mon Oct 04, 2004 4:16 pm

Postby diomed » Sun Nov 23, 2008 6:03 pm

Thanks Pan Zareta! :)

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Mon Nov 24, 2008 12:40 am

Pan Zareta wrote:The map above seems to suggest otherwise, but then again it appears to have been created using somewhat different sequence motifs from Jansen's.


The map was created by downloading whatever sequences were in Genbank for the breeds I wanted to compare, then aligning them and trimming the sequences down to what they had in common.
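For anyone who wants to repeat the trimming step, here is a rough sketch of the idea. It is not the software I used. It assumes each sequence comes with a known start coordinate and has no gaps, and the toy bases and record names are invented.

```python
def trim_to_common(records):
    """Trim gap-free sequences to the region they all share.

    records maps a name to (start_bp, sequence). Returns the first bp
    of the shared region and a dict of trimmed sequences.
    """
    common_start = max(start for start, _ in records.values())
    common_end = min(start + len(seq) - 1 for start, seq in records.values())
    if common_start > common_end:
        raise ValueError("sequences share no common region")
    trimmed = {}
    for name, (start, seq) in records.items():
        lo = common_start - start        # offset of the shared start
        hi = common_end - start + 1      # one past the shared end
        trimmed[name] = seq[lo:hi]
    return common_start, trimmed


# Toy records with invented bases and coordinates.
records = {
    "a": (100, "ACGTACGTAC"),
    "b": (103, "TACGTACGGG"),
    "c": (98, "GGACGTACGTA"),
}
start, trimmed = trim_to_common(records)
print(start, trimmed)
```

Real GenBank downloads may need a proper multiple alignment first, since insertions and deletions break the simple coordinate arithmetic used here.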

I did not try to correlate the branches with either Jansen or Hill et al. because my sample set was so much more limited than theirs was.

Sorry about the delay in responding - I wasn't notified of activity on this thread :shock:

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Mon Nov 24, 2008 1:22 am

Pan Zareta wrote:Why wait? :wink: See GenBank accession #s EU580148-EU580172. The working title (no publication to date) is "Thoroughbred horse mitochondrial DNA reveals that maternal pedigree sub-strain or sub-family designation is a better predictor of genetic history of descent than strain or family designation". The nomenclature in the accession definitions is fairly straightforward, as was that of the TB submissions for the 2002 study (AF481305-AF481323), meaning it's not too difficult to associate TB female family/families w/ accession #'s. Looks to me like one or more members of family 15 have been sampled and that their mtDNA is probably consistent w/ haplotype E as defined in the 2002 study. However, the GenBank submissions are short sequence (bp 15476 - 15818), not the longer sequence (15456 - 15837) that was used to analyze TB haplotypes for that study. Since one of haplotype E's segregating sites is 15827, which falls outside the short sequence, it's impossible to know for sure.


Some of the accession definitions are straightforward, but I sure would like to know for certain what haplotype the hyphenated designations pertain to. Some I've been able to figure out, like 2-8-16 = F since those families share that haplotype, but 10-14-42 or 6-20-23 I'm not as sure about.

xfactor fan
Breeder's Cup Winner
Posts: 2212
Joined: Thu Sep 16, 2004 8:46 pm

Postby xfactor fan » Mon Nov 24, 2008 10:25 am

I'd like to know that too. Is there something published that says F= this kind of code? Also, is there any way to figure out which horses were tested?

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Mon Nov 24, 2008 11:07 am

xfactor fan wrote:I'd like to know that too. Is there something published that says F= this kind of code? Also, is there any way to figure out which horses were tested?


Hill's paper lists which haplotypes were found in which TB families. It also has a table of the nucleotide positions where the haplotypes differ.

The letters she assigned to her haplotypes are different from the letters Vila et al. used for the clades in their paper, but she has a diagram that shows where her haplotypes fit into Vila's clades.

I'm updating the tree I did earlier and will post a link later on.

Pan Zareta
Breeder's Cup Winner
Posts: 2074
Joined: Wed Dec 22, 2004 10:55 am
Location: west TX boonies

Postby Pan Zareta » Mon Nov 24, 2008 1:37 pm

aethervox wrote:Some of the accession definitions are straightforward, but I sure would like to know for certain what haplotype the hyphenated designations pertain to. Some I've been able to figure out, like 2-8-16 = F since those families share that haplotype, but 10-14-42 or 6-20-23 I'm not as sure about.


Between bp 15476-15818 (the short sequence submitted to Genbank) 10-14-42 is consistent w/ haplotypes B found in FF 10, and D found in FF 14 (Hill et al., 2002 p.288).

6-20-23 is similarly consistent w/ haplotype N, found in one member of FF 6 descending from Cream Cheeks by stud book records. It was not considered to represent the founder haplotype for that family.

It seems a reasonable guess that representatives of FF's 42 and 20, 23 were sampled and found to be identical to 10, 14 and 6, respectively, at short sequence, although they may segregate outside that sequence. Haplotypes B and D segregate at bp 15827, and haplotype N varies from the reference sequence there. N may be one of at least two haplotypes found in samples from FF's 20 and 23, since two other submissions representing different haplotypes were defined as TB-20 and TB-23.

In regard to the maps, what I meant by sequence motifs is whether or not the entire sequence had been used. Jansen et al. excluded bp 15585, 15597, and 15650 as "mutational hotspots", and downweighted 15659 and 15737 as "hypervariable".
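To make the weighting idea concrete, here is a hedged sketch of a site-weighted difference count. The site lists are the ones named above; the 0.5 downweight is an assumed placeholder (no actual weight is given here), and the function names and toy bases are invented.

```python
EXCLUDED = {15585, 15597, 15650}   # treated as "mutational hotspots"
DOWNWEIGHTED = {15659, 15737}      # treated as "hypervariable"
DOWNWEIGHT = 0.5                   # assumed value, not taken from Jansen et al.


def site_weight(bp):
    """Weight given to a mismatch at nucleotide position bp."""
    if bp in EXCLUDED:
        return 0.0
    if bp in DOWNWEIGHTED:
        return DOWNWEIGHT
    return 1.0


def weighted_distance(seq_a, seq_b, start_bp):
    """Sum of site weights over positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        site_weight(start_bp + i)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    )


# Toy bases: mismatches at 15584 (full weight) and 15585 (excluded).
print(weighted_distance("AAAAA", "AGGAA", 15583))
```

Setting every weight to 1.0 gives the plain "use the entire sequence" comparison, which is why two trees built from the same samples can still disagree.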

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Mon Nov 24, 2008 5:51 pm

Pan Zareta wrote:In regard to the maps, what I meant by sequence motifs is whether or not the entire sequence had been used. Jansen et al. excluded bp 15585, 15597, and 15650 as "mutational hotspots", and downweighted 15659 and 15737 as "hypervariable".


No, the software I was using isn't that sophisticated. I used the entire sequence.

xfactor fan
Breeder's Cup Winner
Posts: 2212
Joined: Thu Sep 16, 2004 8:46 pm

Postby xfactor fan » Tue Nov 25, 2008 9:36 pm

Here's my understanding of the mtDNA families. Please correct me if any part of this is wrong.

Thanks.

English Families:

#1 H
#1 F from Maid of the Glen
#2 F
#3 E
#4 J
#5 L
#5 M from Hag
#5-e Sequenced no letter assigned
#6 C from Betty Percival
#6 N from Cream Cheeks
#7 F
#8 F
#8-c Sequenced no letter assigned
#9 A from Maid of Marsham
#9 G from Sister to Sloven
#10 B
#11 J
#11 L from Young Camilla
#11 P
#11-f Sequenced no letter assigned
#12 G
#12 Q
#12-b Sequenced no letter assigned
#13 J
#14 D
#15 E Sequence same as #3
#16 F
#16 H from Lady Alice
#17 F
#18 E Sequence same as #3
#19 K
#19 O from Violet
#20 Sequenced no letter assigned
#21 Sequenced no letter assigned
#22 F
#23 ??
#24 ??
#25 I
#26 Sequenced no letter assigned
#B4 Sequenced no letter assigned
#B3 Sequenced no letter assigned
#A4 Sequenced no letter assigned

#10 Other family members are B
#14 Other family members are D
#42 Sequenced no letter assigned

#6 Other family members are C or N
#20 ??
#23 ??


Family #6 sequence EU580158 is not the same as EU580159 (#6, #20, #23). So, making a leap of logic, the main branch of Family #6 is C, and the Cream Cheeks branch of #6, along with #20 and #23, is N.
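One way to see which families end up sharing a letter is to invert the list. This is only a sketch of my list as posted (lettered entries only, with the "??" and unlettered ones left out), so it inherits any mistakes in it; the names in the code are made up.

```python
from collections import defaultdict

# Family number -> haplotype letters, copied from the list as posted.
FAMILY_HAPLOTYPES = {
    1: {"H", "F"}, 2: {"F"}, 3: {"E"}, 4: {"J"}, 5: {"L", "M"},
    6: {"C", "N"}, 7: {"F"}, 8: {"F"}, 9: {"A", "G"}, 10: {"B"},
    11: {"J", "L", "P"}, 12: {"G", "Q"}, 13: {"J"}, 14: {"D"},
    15: {"E"}, 16: {"F", "H"}, 17: {"F"}, 18: {"E"}, 19: {"K", "O"},
    22: {"F"}, 25: {"I"},
}


def families_by_haplotype(mapping):
    """Invert family -> letters into letter -> sorted list of families."""
    groups = defaultdict(list)
    for family, letters in mapping.items():
        for letter in letters:
            groups[letter].append(family)
    return {letter: sorted(fams) for letter, fams in sorted(groups.items())}


groups = families_by_haplotype(FAMILY_HAPLOTYPES)
print(groups["E"])  # [3, 15, 18]
print(groups["F"])  # [1, 2, 7, 8, 16, 17, 22]
```

A shared letter here only means the listed sequences matched over the region that was compared, not that the families are proven to share a founder.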

diomed
Grade III Winner
Posts: 1142
Joined: Mon Oct 04, 2004 4:16 pm

Postby diomed » Wed Nov 26, 2008 5:11 am

Thanks X factor!
So, family #s 3, 15, and 18 share the same ancestress at some point in time?
Interesting.
Of course, it could go WAY back.

Great seeing this thread stay alive!

Pan Zareta
Breeder's Cup Winner
Posts: 2074
Joined: Wed Dec 22, 2004 10:55 am
Location: west TX boonies

Postby Pan Zareta » Wed Nov 26, 2008 11:14 am

diomed wrote:Thanks X factor!
So, family #s 3, 15, and 18 share the same ancestress at some point in time?


No. The short sequence defined as 3-15 = haplotype E. The one defined as 3-18 is a haplotype previously undocumented in family 3. At short sequence it is identical to the reference sequence.

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Wed Nov 26, 2008 3:19 pm

After analyzing all of the Hill et al. sequences, I've discovered a few anomalies:

Haplotypes P and J are indistinguishable using her GenBank sequences. Her submitted sequences appear to run from 15476 to 15820, which makes me wonder why she didn't submit samples that included nucleotides 15465 (where P is different from J) and 15821-15827.

There are problems with Table 2 - Nucleotide position 15584 is C not T in the reference sequence, and 15807 appears to match position 15806 in my alignment.

Given those anomalies, though:

I found additional differences in the 2008 samples at positions 15479, 15486, 15495, 15563, 15574, 15651 and 14720.

The family marked 3-18 matches the reference pattern (as Pan Zareta pointed out).
2-8-16 is haplotype F (shares with Fams 2, 8, 16).
3-15 is haplotype E (shares with Fam 3)
4, 11, 13, and 4-11-13 share the same haplotype (presumably J)
5e and 6-20-23 share the same haplotype (matches M in original study)
6-20-23 matches N haplotype.
7-17-22 is also haplotype F (shares with Fams 7, 17, 22)
9-12 is haplotype G (shares with Fam 12)
10-14-42 is haplotype D (shares with Fam 14).
11f differs from the J/P haplotype (which includes Family 11) by two basepairs (15479 and 15486)
12b differs from 3-18 and the reference by one basepair (15495)
20 matches O haplotype.
A4 is one basepair different than haplotype H (location 15574)

The rest, 12, 26, 8c, B3, and B4 don't match any of Hill's other published haplotypes as far as I can tell.
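The comparisons above boil down to listing where two aligned sequences disagree. A minimal sketch of that step, assuming gap-free sequences of equal length with a known starting coordinate; the bases below are invented toy data, not the real GenBank records.

```python
def diff_positions(seq_a, seq_b, start_bp):
    """Return the nucleotide positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        start_bp + i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Toy sequences starting at bp 15476 that differ at two sites.
ref_like = "ACGTACGTACGT"
variant = "ACGAACGTACCT"
print(diff_positions(ref_like, variant, 15476))  # [15479, 15486]
```

An empty list means the two sequences match over the compared stretch, which is all that "shares the same haplotype" can mean for the short submissions.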

xfactor fan
Breeder's Cup Winner
Posts: 2212
Joined: Thu Sep 16, 2004 8:46 pm

Postby xfactor fan » Wed Nov 26, 2008 3:33 pm

Is family 10, B or D? The TB heritage site has it as B. Was there a revision or more data to revise this?

aethervox
Allowance Winner
Posts: 395
Joined: Sat Mar 15, 2008 11:39 am

Postby aethervox » Wed Nov 26, 2008 3:41 pm

xfactor fan wrote:Is family 10, B or D? The TB heritage site has it as B. Was there a revision or more data to revise this?


I have family 10 as B - in Hill's paper it differs from D by one basepair in location 15827. B matches the reference (A) while D is G.
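As a sketch of why the short sequences can't settle this, the check below looks up the base at 15827 and returns nothing when the sequence doesn't reach that far. It assumes the sequence already matches B/D everywhere else; the function name and toy bases are invented.

```python
SITE = 15827  # the one position where B and D differ


def call_b_or_d(seq, start_bp):
    """Return 'B', 'D', 'other', or None if the sequence misses the site."""
    idx = SITE - start_bp
    if idx < 0 or idx >= len(seq):
        return None  # site not covered, so B and D cannot be told apart
    base = seq[idx].upper()
    return {"A": "B", "G": "D"}.get(base, "other")


print(call_b_or_d("TTAT", 15825))       # B
print(call_b_or_d("TTGT", 15825))       # D
print(call_b_or_d("N" * 343, 15476))    # None (covers 15476-15818 only)
```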

xfactor fan
Breeder's Cup Winner
Posts: 2212
Joined: Thu Sep 16, 2004 8:46 pm

Postby xfactor fan » Wed Nov 26, 2008 6:45 pm

10-14-42 is haplotype D (shares with Fam 14).

So are these guys B or D?

Or is family 10 split with some members being B, and others D?