Monday, September 24, 2018

Pharo Script of the Day: Parsing Gene Ontology terms

Today's script comes from the bioinformatics world: it accesses QuickGO, an EBI REST service, to retrieve Gene Ontology information. The first script uses the NeoJSON library to parse the response into a Pharo dictionary and returns its 'results' entry, which we can inspect interactively in the Inspector:

(NeoJSONReader on: (ZnClient new
 accept: ZnMimeType applicationJson;
 url: ',GO:0017071,GO:0030680';
 get) readStream) next at: 'results'.
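To explore the parsed structure without calling the service, here is a minimal sketch that feeds NeoJSON a small hand-written JSON string. The sample is hypothetical: the 'id' and 'name' fields and their placeholder values are assumptions about the shape of the response, not actual QuickGO output.

```smalltalk
| json results pairs |
"Hypothetical, trimmed-down sample; the field names are assumptions."
json := '{"results":[{"id":"GO:0017071","name":"sample name A"},{"id":"GO:0030680","name":"sample name B"}]}'.
"NeoJSON maps JSON objects to Dictionaries and JSON arrays to Arrays by default."
results := (NeoJSONReader fromString: json) at: 'results'.
"Turn each entry into an id -> name Association."
pairs := results collect: [ :term | (term at: 'id') -> (term at: 'name') ].
pairs inspect.
```

If the real response uses different keys, only the two `at:` sends inside the block need to change.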

You only need to provide the GO identifiers, delimited by commas. In the second script we can simply join a Collection to build the comma-delimited String. The API also includes a service that returns a graph image of the terms involved in the query, so we can try retrieving a different type of information:

| ids |
ids := #('GO:0005623' 'GO:0017071' 'GO:0030680').
(ImageReadWriter formFromStream: (ZnClient new
 accept: ZnMimeType imagePng;
 url: '' , (ids joinUsing: ',') , '/chart';
 get) readStream) asMorph openInWindow.

Previously, EBI returned XML content in oboxml format, and the following Pharo script used the XML-Parser and XPath libraries to parse GO terms. I include the script for those who are discovering Smalltalk and want to see how it would look using XPath:

#('GO:0005623' 'GO:0017071' 'GO:0030680') collect: [ : goTerm |
 | quickGO |
 quickGO := '{1}&format=oboxml' format: { goTerm }.
 goTerm -> (XPath 
  for: 'normalize-space(/obo/term/name/text())' 
  in: (XMLDOMParser on: (ZnEasy get: quickGO) contents) parseDocument) ].
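Since the oboxml service is no longer available, the XPath part can still be tried offline. The following is a minimal sketch, assuming the XML-Parser and XPath libraries are loaded; the XML string is a hypothetical stand-in for an oboxml document, reduced to the single path the expression touches.

```smalltalk
| xml doc name |
"Hypothetical stand-in for an oboxml document; only /obo/term/name is modelled."
xml := '<obo><term><name>  sample   term name </name></term></obo>'.
"Parse the string into a DOM document, as the script above does with the HTTP contents."
doc := (XMLDOMParser on: xml) parseDocument.
"normalize-space trims the ends and collapses inner runs of whitespace."
name := XPath for: 'normalize-space(/obo/term/name/text())' in: doc.
name inspect.
```

With this input the expression should answer the name with its surrounding and repeated whitespace cleaned up, which is why the original script wraps the text node in normalize-space.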

Similar code is available in Perl and Java, so you can compare.

