
Comments (10)

BorisYourich commented on August 12, 2024

From the 400-something species in the set of ribosomal transcripts, I created clusters based on the taxonomic rank "class" to obtain the most diverse subset of species, and picked one species from each of the 50 clusters. I then mined the SRA for experiments that are single-end or paired-end and fall into one of the specified read-length intervals (<30, <80, <130, <180). I obtained 300+ results, of which only 83 had a PubMed ID associated with them; those are the records I appended to the HTSinfer test data table. I haven't gone through the publications yet; maybe it would be nice if another pair of eyes checked them. Anyway, it's no problem for me to search for library and adapter info in the publications. I'm just not sure whether we need so many test datasets, or whether we perhaps want more?
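For readers following along, the selection step described above could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function names, the `species_to_class` mapping, and the reading of the length thresholds as consecutive bins are all assumptions.

```python
import random
from collections import defaultdict

# Upper bounds of the read-length intervals mentioned above (<30, <80, <130, <180).
# Treating them as consecutive bins is an assumption.
LENGTH_BOUNDS = (30, 80, 130, 180)


def pick_one_per_class(species_to_class, seed=0):
    """Group species by taxonomic class and pick one per class, reproducibly."""
    by_class = defaultdict(list)
    for species, tax_class in species_to_class.items():
        by_class[tax_class].append(species)
    rng = random.Random(seed)
    # Sort classes and members so the choice does not depend on input order.
    return {
        tax_class: rng.choice(sorted(members))
        for tax_class, members in sorted(by_class.items())
    }


def length_interval(read_length):
    """Return the upper bound of the bin a read length falls into, or None."""
    for bound in LENGTH_BOUNDS:
        if read_length < bound:
            return bound
    return None


# Hypothetical input: species abbreviation -> taxonomic class.
example = {
    "hsapiens": "Mammalia",
    "mmusculus": "Mammalia",
    "drerio": "Actinopteri",
}
picks = pick_one_per_class(example)
```

Each picked species would then be used to query the SRA, with `length_interval` deciding which read-length bin a returned experiment belongs to.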

from htsinfer.

uniqueg commented on August 12, 2024

Awesome job @BorisYourich!

Actually, we were planning to spend an afternoon session with 10 people or so mining, so that the work isn't so dull on any one person. Your work is absolutely perfect for that, because all we need to do is go through the publications and check for the info we need (sequencing kit, most importantly, or adapter, length, orientation directly, if available).

FYI @mzavolan


BorisYourich commented on August 12, 2024

Hey, I am sorry I didn't collect all the samples yet. I had a very hectic two weeks with onboarding on the new project, but I have dedicated tomorrow to this project and will run the mining scripts and share all the results.


uniqueg commented on August 12, 2024

I think we have done this, no, @balajtimate? I will close it, please re-open if you think it's not solved.


BorisYourich commented on August 12, 2024

Alright, I am happy you like it 😃 glad to have helped, good luck with mining the publications 👍


uniqueg commented on August 12, 2024

Looking at the list of organisms that you ended up sampling, I thought that maybe we could also have some more representation by the more common organisms and closely related species. Would it be easy for you to go through the same selection process for a given list of species?

If so, could you do this for the following?

# primates
cjacchus
hsapiens
mmulatta
pabelii
panubis
ppaniscus
ptroglodytes

# livestock
btaurus
chircus
ecaballus
ggallus
oaries
sscrofa

# rodents
cporcellus
mauratus
mmmarmota
mmusculus
ocuniculus
rnorvegicus

# fungi
pmexicana
scerevisiae
spombe

# plants
athaliana
bvulgaris
cjaponica
osativa
zmays

# worms
celegans
cbriggsae

# fish
drerio
olatipes
xmaculatus

# insects
amellifera
bmori
dmelanogaster

# amphibians & reptiles
acarolinensis
xtropicalis


BorisYourich commented on August 12, 2024

Sure, no problem, though I won't manage today. Hopefully it will be ready tomorrow.


uniqueg commented on August 12, 2024

Great. Tomorrow is totally fine. In fact, the earliest we will need it is next Friday.


uniqueg commented on August 12, 2024

Ah, before I forget: We kinda need the SRR identifiers of the actual libraries. We can easily get them from the SRX identifiers, so no need to change that for the available ones. But if it's easy to include them for the new ones, then it would be good to include them.
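In case it helps, here is a minimal sketch of how SRX identifiers could be mapped to their SRR runs. It assumes a RunInfo-style CSV export with `Run` and `Experiment` columns is already at hand; the function name and the identifiers in the example are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def srx_to_srr(runinfo_csv_text):
    """Map each experiment (SRX) to the list of its runs (SRR).

    Assumes the CSV has 'Run' and 'Experiment' columns; rows missing
    either value are skipped.
    """
    mapping = defaultdict(list)
    for row in csv.DictReader(io.StringIO(runinfo_csv_text)):
        run = row.get("Run")
        experiment = row.get("Experiment")
        if run and experiment:
            mapping[experiment].append(run)
    return dict(mapping)


# Made-up identifiers, for illustration only.
sample = (
    "Run,Experiment,LibraryLayout\n"
    "SRR0000001,SRX0000001,SINGLE\n"
    "SRR0000002,SRX0000001,SINGLE\n"
    "SRR0000003,SRX0000002,PAIRED\n"
)
runs_by_experiment = srx_to_srr(sample)
```

One experiment can have several runs, which is why the values are lists rather than single identifiers.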


uniqueg commented on August 12, 2024

@BorisYourich: did you have a chance to look at this?

