|
Caltech Update - Diane Trout
Current accomplishments:
* Built a number of packages to help install genex
* Helped Brandon with some of the bugs he's run into while installing
genex.
* A whole lot of meetings
Next week:
* Work on connecting the Novosoft XMI reader to pymerase (high priority)
(for both an internal project as well as Python-Mage)
* Fix issues that Brandon found in pymerase (high)
* Work on connecting pymerase to GeneX's new security model (mid)
* Build debs for Bio::Mage & GeneX (low)
Problems:
* Haven't had time to work on debs or prepping a download site to be
hosted on sourceforge.
Caltech Update - Brandon King
Summary:
* Installed GeneX 2.x with help from Diane and Jason
* Started uploading QuantArray test data to GeneX 2.x
* No form to upload ExperimentSets (looks like this is now fixed), used SQL as a short-term solution
* Successfully generated and uploaded ArrayDesigns
* Tested uploading of existing FeatureExtractionSoftware, detected entry and didn't upload. =o)
* Tried to upload QuantArray expression data, ran into bugs, submitted bug reports
* Worked on testing and updating pymerase to help prepare for GeneX 2.x and MAGE support.
Next Week:
* Finish Uploading QuantArray expression data
* Create GenePix FeatureExtractionSoftware XML file
* Create ArrayDesign from GenePix data
* Upload GenePix expression data
* Help Diane update Pymerase
Problems:
* Usual bugs found in development projects.
Future:
* Look into ways of helping out with the GeneX project more efficiently
New or Changing Priorities:
* No changes, yet.
Random Ideas:
* Maybe use a format similar to this for weekly updates, where Summary, Next Week, and Problems are the most important sections to include.
* Go to lunch =o)
UCI Update - Harry Mangalam
...And thanks to Brandon and Diane for the biff to the butt for getting
this started.
Over the last 2 months, I've been mostly working on analysis add-ons and
tweaks to GeneX and related gene expression paths, with the emphasis on
OpenDX:
Open Data Explorer (DX) is an advanced visualization system
developed by IBM over the last 15 years. It was initially a
commercial product aimed at high end visualization markets, using
advanced single and multi-CPU Unix workstations. In 1999, as
part of their Deep Computing Initiative, IBM released the source
code to DX as Open Source. Over the years since, it has been
ported to Linux and Windows, and continues to improve. It is
particularly notable as complex visualizations can be programmed
visually by means of 'drag and dropping' icons representing data
transformations onto a canvas and 'wiring' them together by means
of connecting input and output tabs together with mouse clicks.
One problem with DX is that while it is an exceptional
visualization environment, it was developed prior to the current
emphasis on accessing data from relational databases, relying on
the large data file formats that were (and still are) standard in
these fields - primarily HDF and netCDF. This is a problem
particularly for the gene expression field, where data is
progressively stored in relational databases to support complex
queries.
[see http://www.opendx.org]
Mostly this has been in pursuit of a Perl-OpenDX link so that a module
can be added to a visual DX net allowing arbitrary Perl code to be run
on the input. There are various problems - DX prefers to operate on
particular types of data fields, not the simple string input that Perl
favors, but there are many possibilities, high among them using Perl's
DBI to suck data from RDBs and format it for DX to do automated analysis
and visualization on.
Status: I've gotten some DX-Perl modules built, but am still a ways away
from a solid generic module.
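
As a rough illustration of the database-to-DX path described above, here is a
minimal sketch in Python rather than Perl. It assumes a hypothetical `spot`
table (columns `grid_x`, `grid_y`, `intensity`) in an in-memory SQLite database
standing in for the real RDB, and writes the rows out as a DX native-format
field with irregular 2-D positions. The table, the column names, and the
helper functions are all invented for the example; this is not the DX-Perl
module itself.

```python
import sqlite3


def rows_to_dx(rows):
    """Format (x, y, value) rows as a DX native-format field (irregular positions)."""
    rows = list(rows)
    n = len(rows)
    lines = [f"object 1 class array type float rank 1 shape 2 items {n} data follows"]
    lines += [f"{x:g} {y:g}" for x, y, _ in rows]
    lines.append(f"object 2 class array type float rank 0 items {n} data follows")
    lines += [f"{value:g}" for _, _, value in rows]
    # The data values are tied one-to-one to the positions.
    lines.append('attribute "dep" string "positions"')
    lines.append('object "expression" class field')
    lines.append('component "positions" value 1')
    lines.append('component "data" value 2')
    lines.append("end")
    return "\n".join(lines) + "\n"


def fetch_spots(conn):
    """Pull spot coordinates and intensities from the (hypothetical) spot table."""
    cur = conn.execute(
        "SELECT grid_x, grid_y, intensity FROM spot ORDER BY grid_y, grid_x"
    )
    return cur.fetchall()


def demo():
    """Build a tiny stand-in database and return its contents as DX text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spot (grid_x REAL, grid_y REAL, intensity REAL)")
    conn.executemany(
        "INSERT INTO spot VALUES (?, ?, ?)",
        [(0, 0, 1.5), (1, 0, 2.25), (0, 1, 0.5)],
    )
    return rows_to_dx(fetch_spots(conn))


if __name__ == "__main__":
    print(demo(), end="")
```

The resulting text could be saved to a file and read with DX's Import module;
a real module would still need to handle the type conversions mentioned above.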
A close second priority is the integration of the R statistical language
in the same way - compile it as a shared lib and have it communicate to
DX via sockets in the same way that I have Perl doing. R is actually a
better fit as it has an idea about data objects and already has support
for many of the interconversions of data types that DX supports.
A logical 3rd candidate would be Python, as it theoretically integrates
more easily with C apps than Perl and there is already a Python project
for interacting with DX (tho it seems to be more for controlling DX from
Python than for rolling Python into DX).
The other project is the updating and converting to commandline scripts
those cgi analysis scripts that were included in the GeneX 1.0 release.
These include the CyberT significance testing, the Eisen/Sherlock
clustering code (now incorporating Swaine Lin Chen's slcview heatmap
generation), and Karen's Rcluster codes.
These updates should be making their way back into the CVS tree very
soon. Currently they are designed to work with a tree of
QuantArray-formatted replicates, doing significance testing on all of
the subgroups and then a summary significance test on all the subgroups
(this last summary test is specific to the experiment, so it may not be
appropriate for all experimental designs).
I'm also just about finished with a longer page describing the
software in more detail and will post the location as soon as it's
ready, probably next week.
My last noise is a suggestion of starting to put together ideas for
making an external GUI app that does nothing but query the GeneX DB and
makes the output available as:
- tab-delimited spreadsheet-like files
- MAGE-ML
- N-dimensional file such as netCDF/HDF/XDF
- maybe a few others
My suggestion would be to use Python and Qt so that it runs
cross-platform, using either Qt Designer or BlackAdder (if it EVER
sees the light of day). The Python bindings seem to be more stable and
coherent than for Perl.
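
To make the first output option concrete, here is a minimal sketch of the
query-and-export core such an app might wrap, without any GUI. It uses the
standard library's sqlite3 as a stand-in for the GeneX DB, and the
`measurement` table, its columns, and the gene names are hypothetical. Only
the tab-delimited output is shown; MAGE-ML and netCDF/HDF writers would be
separate pieces.

```python
import csv
import io
import sqlite3


def export_tab_delimited(conn, query, params=()):
    """Run a query and return the result as tab-delimited text with a header row."""
    cur = conn.execute(query, params)
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # Column names come from the cursor, so any SELECT can be exported.
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur)
    return buf.getvalue()


def demo():
    """Export two rows from a tiny stand-in database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurement (gene TEXT, ratio REAL)")
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?)",
        [("geneA", 1.5), ("geneB", 0.75)],
    )
    return export_tab_delimited(
        conn, "SELECT gene, ratio FROM measurement ORDER BY gene"
    )


if __name__ == "__main__":
    print(demo(), end="")
```

Keeping the export logic separate from the GUI would let the same function
back both a Qt front end and a commandline script.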
UVa Update - Tom Laudeman
Following Harry's good example, here's what we're up to:
We're showing GeneX to end users this morning. We gave it to the Microarray Center people last week, but they've had a lull in orders.
Jodi/Teela has finished the Analysis Tree schema. I think it has 8 tables. This schema allows analyses to be linked together, and supports the analysis module plugins.
Teela has been working on the analysis tree backend.
I've got the trees drawing correctly, and nodes can be deleted, added, and renamed. Trees render correctly no matter how complex. I created this before the db was ready, so I've been integrating it with the Analysis Tree schema.
The tree drawing is way cool. We can set up accounts on reed6.med for people who want to see it.
Open Informatics Update - Jason Stewart
Here's the status for me and my sub-contractors, Mark Wilkinson and
Hyojoo Kang.
Mark:
Added sysadmin tools for doing user/group maintenance and
created Mason front ends for them.
Hyojoo:
Created a Java GUI for creating QT Dimensions that uses the MAGE-Java
API and thus can export MAGE-ML. The MAGE-ML QT Dimension is needed to
configure the data loader.
Jason:
Planning
========
I've been getting ready to switch to Postgres 7.3
Service
=======
Helped Brandon get Genex-2.0a1 running at Caltech
Genex-2
=====
New Tools
--
* created user tools for adding Groups and ExperimentSets and created a
Mason front end for them.
* Install - now uses a MANIFEST file. This file documents all local
files to be installed and where they will be installed. This is a
substitutable file (created from MANIFEST.in), so it is possible to
use the Config.pm values.
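
The MANIFEST.in substitution step can be pictured with a small sketch. The
real installer is Perl and its file format is not shown here, so everything
below is an assumption: a hypothetical template where each line is
"source destination", `${NAME}` placeholders stand for Config.pm values, and
`#` starts a comment.

```python
from string import Template


def render_manifest(template_text, config):
    """Expand ${NAME} placeholders and return a list of (source, destination) pairs."""
    entries = []
    for raw in template_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # substitute() raises KeyError if a placeholder has no config value.
        expanded = Template(line).substitute(config)
        source, destination = expanded.split(None, 1)
        entries.append((source, destination))
    return entries


# Hypothetical template and config values, for illustration only.
MANIFEST_IN = """\
# local file        install location
bin/loader.pl       ${BIN_DIR}/loader.pl
html/index.html     ${HTML_DIR}/index.html
"""

CONFIG = {"BIN_DIR": "/usr/local/genex/bin", "HTML_DIR": "/var/www/genex"}

if __name__ == "__main__":
    for source, destination in render_manifest(MANIFEST_IN, CONFIG):
        print(f"{source} -> {destination}")
```

Failing loudly on a missing config value is a deliberate choice in this
sketch, since a silently unexpanded path would install files in the wrong place.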
Schema changes
--
* ExperimentSet - removed creation_date column. This information can
be found by using the Audit trail.
Modifications to Perl API
--
* XMLUtils.pm - changed the internals of xml2sql() to use the new SQL
writing functions in Connect.pm. Added creation of rules to all
security views for INSERT/UPDATE/DELETE. Removed permission granting
code to make it more generic. Added support for a single master
sequence to be used by all tables.
* create_genex_db.pl - modified to handle granting of permissions.
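
For readers unfamiliar with rules on views, here is a minimal sketch of the
kind of statements involved, generated from Python rather than from
XMLUtils.pm. It emits PostgreSQL CREATE RULE statements that redirect
INSERT/UPDATE/DELETE on a view to its underlying table. The view, table, and
column names are hypothetical, and the group-permission checks that a real
security view would need are left out.

```python
def security_view_rules(view, table, key, columns):
    """Return CREATE RULE statements forwarding writes on `view` to `table`."""
    col_list = ", ".join(columns)
    new_values = ", ".join(f"NEW.{c}" for c in columns)
    # The primary key is used to locate the row, not reassigned.
    assignments = ", ".join(f"{c} = NEW.{c}" for c in columns if c != key)
    return [
        f"CREATE RULE {view}_ins AS ON INSERT TO {view} "
        f"DO INSTEAD INSERT INTO {table} ({col_list}) VALUES ({new_values});",
        f"CREATE RULE {view}_upd AS ON UPDATE TO {view} "
        f"DO INSTEAD UPDATE {table} SET {assignments} WHERE {key} = OLD.{key};",
        f"CREATE RULE {view}_del AS ON DELETE TO {view} "
        f"DO INSTEAD DELETE FROM {table} WHERE {key} = OLD.{key};",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in security_view_rules(
        "experimentset_view", "experimentset", "es_pk", ["es_pk", "name"]
    ):
        print(statement)
```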
Goals for next week
===================
Finish rules code - need to fix problem with GroupSec table having
ro/rw_groupname fkeys to itself.
Ensure that ArrayDesign importer can handle Affy U133 files.
Add ability to data loader to import .CEL data and MAS5 data.
|