Friday, December 26, 2008

Perl is useful too

I subscribed to the china-pm and bio-perl-pm mailing lists, but I didn't read them carefully until just now. Some people there describe using Perl to develop plug-ins for Firefox, which surprised me a lot. Perl can be used at such a high level, yet I gave it up. Anyway, regret is useless. I'll try to master both Perl and Java and become a good programmer soon.

Thursday, December 25, 2008

Merry Christmas

Merry Christmas! A new year is coming. Work harder and harder!
Today I plan to finish the technical report to Keith.
Here we go.

Wednesday, December 24, 2008

To Be Creative

A good programmer does not always use other people's code, rearranging and modifying it, but does creative work: creating methods and code of their own, for others to use.

I want to be, and will be, such a programmer.

Monday, December 22, 2008

Some interesting websites

While surfing online today, I found some very interesting websites.
First, metabolicvisualizer, a very creative website. It lets you control the levels of the compounds involved, such as glucose and ATP. As you adjust them, the reactions inside the big cycle area adjust automatically. Very interesting.

Second, DAVID, the Database for Annotation, Visualization and Integrated Discovery. DAVID now provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes.

A paper published in Nature Protocols describes a step-by-step procedure for using DAVID:

Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. (2009) Nat. Protoc. 4:44-57.

http://www.nature.com/nprot/journal/v4/n1/abs/nprot.2008.211.html

Note: This paper, commissioned by Nature Protocols, a Nature-branded journal, systematically describes the rationale and procedures for using the DAVID bioinformatics tools. By following the step-by-step procedure in the paper, readers will be able to use the DAVID tools more efficiently for high-throughput gene functional analysis, leading to more meaningful analyses and better results. At the moment, a subscription to the Nature web site is required to view the full text. We will provide a free archived version of the paper shortly.

Using network alignment to discover underlying disease families

In my recent reports, the question I was asked most often is: what on earth can a network comparison algorithm be used for? Now Xuebing Wu et al. help me give one answer: network alignment can be used to detect or predict underlying disease families. (Other applications include reconstructing the phylogenetic relationships of organisms, integrating genome and proteome data, querying a pathway of interest against known pathway databases, and so on.)

Here is the link to the paper: http://bioinformatics.oxfordjournals.org/cgi/content/short/25/1/98?. Thanks to Samuel for helping me download it.

The coming review and three technical reports

Our annual conference will be held early next month, around Jan 5th. Before that we will have a preliminary report summarizing the whole year's work, probably next Tuesday. So this week I need to start and finish a review paper and two reports for the annual conference, one in Chinese and the other in English. Both reports are technical and need a lot of data from preliminary experiments. On top of these, I also need to prepare three detailed reports for Keith for further discussion, one for each of the three ideas I mentioned in my last letter and in the prepared proposal.

In short, I have heavy paperwork this week:

1. One big review for peer review, on the progress of network comparison methods applied to phylogenetic analysis.

2. Two technical reports, one in English and one in Chinese. They are due before the annual conference and will help me prepare the presentation.

3. Three detailed reports on the ideas I mentioned to Keith. To have a good discussion later, I need to make each of them as detailed as possible.

Today I would like to start with the second item, the simplest, and try to finish it.

Here we go.


Wednesday, December 17, 2008

What would I do today

I attended the regular biweekly report meeting this morning.
I will try my best to write the detailed report for Keith, starting in a little while.
For the annual conference, I need to prepare two reviews as requested, so I'd like to start on them now. One is a review of network comparison algorithms; the other is a detailed record of my thoughts on pathway alignment and on the network comparison algorithm based on it.

Here we go.

Tuesday, December 16, 2008

What would I do today 2008.12.16

Today I mainly plan to record my thoughts in the draft for the new paper.
Finish the schema and start the experiments.
Plus, I need to prepare the presentation for tomorrow's conference.
Plus plus, I'd like to put together detailed slides for the discussion with Keith.
OK, here we go.

(+++: I need to find some time to organize my papers: keep and print out the useful ones and delete the rest.)

It seems that I always change my work plan arbitrarily, following my thoughts.

Let's see what I really did today.
1. I wrote the schema for the doctoral dissertation, which I am sure will be a helpful guide for my future work.
2. Record my thoughts (pending)

I finished the proposal

Finally I finished the proposal. Congratulations!

Sunday, December 14, 2008

Mahalanobis Distance

Just now I saw a paper titled

Using Mahalanobis distance to compare genomic signatures between bacterial plasmids and chromosomes

in Nucleic Acids Research.

To tell the truth, I don't care much what the paper studies, but "Mahalanobis distance" caught my eye as soon as I saw the title. What is the Mahalanobis distance? I knew nothing about it before.

In Chinese it is usually translated as "马哈朗诺比斯距离" (or "马氏距离" for short). Here is something helpful from someone's blog. (From: http://rogerdhj.blog.sohu.com/39020502.html )

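Definition: the distance between two points in p-dimensional space (two p-dimensional vectors x and y) is defined as:

$$d(x, y) = \sqrt{(x - y)^{T}(x - y)}$$

and the Euclidean norm of a point x is:

$$\|x\| = d(x, 0) = \sqrt{x^{T}x}$$

It follows immediately that all points at the same distance c from the origin satisfy

$$x^{T}x = c^{2}$$

This is the equation of a sphere. In other words, every component of an observation x contributes equally to the Euclidean distance from x to the center. In statistics, however, we want a distance in which the components carry different weights: components that vary more should receive smaller weights.

So define the distance between x and y as

$$d(x, y) = \sqrt{(x - y)^{T}\Sigma^{-1}(x - y)}$$

where, in the standard form of the Mahalanobis distance, $\Sigma$ is the covariance matrix of the data.

The norm of x now equals

$$\|x\| = \sqrt{x^{T}\Sigma^{-1}x}$$

and all points at the same distance c from the origin satisfy

$$x^{T}\Sigma^{-1}x = c^{2}$$

This is the equation of an ellipsoid centered at the origin.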

Very clear, right? But please note the essential point of the Mahalanobis distance: the more a component varies, the smaller its weight in the distance should be.
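To make the weighting concrete for myself, here is a minimal Java sketch for the two-dimensional case. The class name, the method names and the covariance values are made up for illustration, and the 2x2 matrix is inverted by the closed-form formula rather than by any library.

```java
// Sketch only: Mahalanobis vs. Euclidean distance for 2-D points.
// The covariance matrix used in main() is made-up illustration data.
class MahalanobisSketch {

    // Euclidean distance: every component gets the same weight.
    static double euclidean(double[] x, double[] y) {
        double dx = x[0] - y[0];
        double dy = x[1] - y[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Mahalanobis distance for a symmetric 2x2 covariance matrix {{a, b}, {b, d}}.
    static double mahalanobis(double[] x, double[] y, double[][] cov) {
        double a = cov[0][0];
        double b = cov[0][1];
        double d = cov[1][1];
        double det = a * d - b * b;
        if (a <= 0 || det <= 0) {
            throw new IllegalArgumentException("covariance must be positive definite");
        }
        double dx = x[0] - y[0];
        double dy = x[1] - y[1];
        // (dx, dy) * inverse(cov) * (dx, dy)^T, with inverse(cov) = {{d, -b}, {-b, a}} / det
        double q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det;
        return Math.sqrt(q);
    }

    public static void main(String[] args) {
        double[] origin = {0, 0};
        double[][] cov = {{4, 0}, {0, 1}};  // the first component varies more
        // Both points are at Euclidean distance 2 from the origin.
        System.out.println(mahalanobis(new double[] {2, 0}, origin, cov)); // prints 1.0
        System.out.println(mahalanobis(new double[] {0, 2}, origin, cov)); // prints 2.0
    }
}
```

The step along the high-variance axis counts for less, which is exactly the "bigger variation, smaller weight" idea.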


Here is an example of applying the Mahalanobis distance to detect outliers. (From: http://nanapple.happy.blog.163.com/blog/static/77501222200883945195/)

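They are called outliers because they are different: they lie far from most of the data. They may be erroneous data that will distort your analysis, or they may be real phenomena waiting for you to discover and understand them so that you can put them to good use. Either way, you should take them seriously.

For one-dimensional data, they are just extreme values and are easy to spot.

For two-dimensional data, outliers stick out in unusual directions. If the variables are correlated, you will see outliers stick out in the two-dimensional plane rather than along a single dimension. You can quantify how far such a point deviates by measuring its distance from the cloud of normally distributed points. This distance is called the Mahalanobis distance.

For three-dimensional data, a 3D spinning plot is used to find outliers. If your data has more than three variables, you have to use other techniques. If the variables are correlated, you will see the data stretch in certain directions, with the outliers sticking out in unusual directions. So you can pick the three main variables and make a 3D spinning plot to find outliers.

For the N-dimensional case, you can instead take the whole correlation matrix into account and compute the Mahalanobis distance of every observation, that is, its N-dimensional distance from the multivariate mean. But then every observation, including the one being measured, goes into the calculation, so the measured distance is correlated with the point itself, which hurts accuracy. In this case the jackknifed distance is better: each point's distance is measured against the observations that exclude that point.

If you are fitting a model, you may want to know how much each observation influences the result. For that you can use a leverage plot, which shows the residual of an observation and the influence that residual has on the model. If you want to uncover hidden information in your data, making flexible use of JMP's powerful graphical tools will definitely help a lot.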


It seems there are still many distance definitions I don't know. Let me think it over: how can this distance help my research?

Friday, December 12, 2008

Wednesday, December 10, 2008

What would I do today

Coding, reading, and blogging.

Inner Classes

Q1:

When you create an inner class, an object of that inner class has a link to the enclosing object that made it, and so it can access the members of that enclosing object -- without any special qualifications. In addition, inner classes have access rights to all the elements in the enclosing class.

How should I understand this paragraph?

Answer: The inner class secretly captures a reference to the particular object of the enclosing class that was responsible for creating it. Then, when you refer to a member of the enclosing class, that reference is used to select that member. Construction of the inner-class object requires the reference to the object of the enclosing class, and the compiler will complain if it cannot access that reference. Most of the time this occurs without any intervention on the part of the programmer.
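A small sketch helps me see the hidden link. The class names here are made up. The inner object is created through a particular outer object (a.new ...), and it reads and writes that object's private field without any qualification.

```java
// Sketch only: an inner-class object keeps a link to the outer object that created it.
class Counter {
    private int count = 0;              // private member of the enclosing class

    class Incrementer {                 // non-static inner class
        void increment() {
            count++;                    // shorthand for Counter.this.count++
        }

        Counter owner() {
            return Counter.this;        // the captured reference, made explicit
        }
    }

    int count() {
        return count;
    }
}

class InnerClassDemo {
    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        Counter.Incrementer incA = a.new Incrementer(); // an enclosing object is required
        incA.increment();
        incA.increment();
        System.out.println(a.count()); // prints 2
        System.out.println(b.count()); // prints 0, b was never touched
    }
}
```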

Q2. Does an outer class have access to the private elements of its inner class?

It seems the answer is yes, judging from my little test code. But why?
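One way to see why: the Java Language Specification allows access to a private member anywhere within the body of the top-level class that encloses its declaration. An inner class is declared inside the outer class body, so the outer class is inside that scope too. A sketch with made-up names:

```java
// Sketch only: private members of an inner class are visible to the enclosing class.
class PrivateAccessOuter {

    private class Inner {
        private int secret = 42;

        private int doubled() {
            return secret * 2;
        }
    }

    int peek() {
        Inner in = new Inner();
        // Both accesses compile, although secret and doubled() are private to Inner.
        return in.secret + in.doubled();
    }

    public static void main(String[] args) {
        System.out.println(new PrivateAccessOuter().peek()); // prints 126
    }
}
```

So private here means "private to the outermost class", not "private to the innermost pair of braces".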

Q3.
If you're defining an anonymous inner class and want to use an object that's defined outside the anonymous inner class, the compiler requires that the argument reference be final. If you forget, you'll get a compile-time error message. Why?
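As far as I understand it, the reason is that the anonymous object can outlive the method call, so it cannot keep using the method's local variable itself; it gets a copy of the value when it is created. Requiring final guarantees that the copy and the original can never disagree. (From Java 8 on, the variable only has to be "effectively final", so the keyword may be omitted.) A sketch with made-up names:

```java
// Sketch only: an anonymous inner class using a value defined outside it.
class FinalCaptureDemo {

    interface Message {
        String text();
    }

    // name is declared final so the anonymous class below may use it.
    static Message greet(final String name) {
        return new Message() {
            @Override
            public String text() {
                // Uses the copy of name captured when this object was created.
                return "Hello, " + name;
            }
        };
    }

    public static void main(String[] args) {
        Message m = greet("world");
        // greet() has already returned, yet the captured value is still available.
        System.out.println(m.text()); // prints Hello, world
    }
}
```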

Tuesday, December 9, 2008

Interfaces

The fields in an interface are implicitly static and final. The fields, of course, are not part of the interface. The values are stored in the static storage area for that interface.
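A quick check of the first sentence, as a sketch with made-up names. The field is declared with no modifiers at all, and reflection still reports it as public, static and final.

```java
// Sketch only: interface fields are implicitly public static final.
interface Months {
    int JANUARY = 1;      // no modifiers written
    int DECEMBER = 12;
}

class InterfaceFieldDemo {

    // Reports whether a public field of the given type carries all three modifiers.
    static boolean isPublicStaticFinal(Class<?> type, String fieldName) {
        try {
            int m = type.getField(fieldName).getModifiers();
            return java.lang.reflect.Modifier.isPublic(m)
                && java.lang.reflect.Modifier.isStatic(m)
                && java.lang.reflect.Modifier.isFinal(m);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(Months.DECEMBER);                             // prints 12, no object needed
        System.out.println(isPublicStaticFinal(Months.class, "JANUARY")); // prints true
        // Months.JANUARY = 2;  // would not compile: the field is final
    }
}
```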

Factory Method design pattern: instead of calling a constructor directly, you call a creation method on a factory object which produces an implementation of the interface -- this way, in theory, your code is completely isolated from the implementation of the interface, thus making it possible to transparently swap one implementation for another.
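A minimal sketch of the idea. All names here (Service, ServiceFactory and the two implementations) are made up for illustration. The consume method only knows the two interfaces, so swapping the implementation means passing a different factory.

```java
// Sketch only: Factory Method with hypothetical Service implementations.
interface Service {
    String run();
}

interface ServiceFactory {
    Service create();
}

class FastService implements Service {
    public String run() { return "fast"; }
}

class SafeService implements Service {
    public String run() { return "safe"; }
}

class FastServiceFactory implements ServiceFactory {
    public Service create() { return new FastService(); }
}

class SafeServiceFactory implements ServiceFactory {
    public Service create() { return new SafeService(); }
}

class FactoryDemo {
    // Client code: no constructor call, no mention of a concrete class.
    static String consume(ServiceFactory factory) {
        Service s = factory.create();
        return s.run();
    }

    public static void main(String[] args) {
        System.out.println(consume(new FastServiceFactory())); // prints fast
        System.out.println(consume(new SafeServiceFactory())); // prints safe
    }
}
```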

An appropriate guideline is to prefer classes to interfaces. Start with classes, and if it becomes clear that interfaces are necessary, then refactor. Interfaces are a great tool, but they can easily be overused.

What I do today 2008.12.9

1. The Chapter of Interfaces and Inner Classes.

2. Record my thoughts on the doctoral defense conference of Liu Wei.

Monday, December 8, 2008

What would I do today - 2008.12.08

This morning I attended the pre-defense of Liu Wei, one of my group mates. She started her doctoral study together with me. But now her student life is coming to an end while I'm still struggling with mine. Sigh, nothing to do but push myself harder.
Here is my work plan for this afternoon and tonight:
1. Review one paper for CEC'09, which would take me 2~3 hours I guess. (finished)
2. Recompose the proposal for Prof. Wang. It must be finished before his arrival tomorrow. (pending)
3. Finish the First part of Thinking in Java. (finished)

Thursday, December 4, 2008

The Key Words: static & public & final

public: so they are usable outside the package;
static: to emphasize that there's only one
final : to say that it's a constant.

Note that final static primitives with constant initial values (that is, compile-time constants) are named with all capitals by convention, with words separated by underscores.
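A small sketch of the three keywords together, with made-up names and values. It also shows a point that is easy to miss: final on a reference fixes the reference, not the contents of the object it points to.

```java
// Sketch only: public static final fields, with hypothetical names and values.
class ConstantsDemo {
    public static final int MAX_RETRIES = 3;          // compile-time constant
    public static final int[] PORTS = {8080, 8081};   // final reference, mutable contents

    public static void main(String[] args) {
        // MAX_RETRIES = 4;        // would not compile: cannot assign a final variable
        // PORTS = new int[0];     // would not compile either
        PORTS[0] = 9090;           // allowed: the array itself is not constant
        System.out.println(MAX_RETRIES); // prints 3
        System.out.println(PORTS[0]);    // prints 9090
    }
}
```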

Choosing composition vs. inheritance

Both composition and inheritance allow you to place subobjects inside your new class (composition does this explicitly; with inheritance it's implicit). You might wonder about the difference between the two, and when to choose one over the other.

Composition is generally used when you want the features of an existing class inside your new class, but not its interface. That is, you embed an object so that you can use it to implement features in your new class, but the user of your new class sees the interface you've defined for the new class rather than the interface from the embedded object. For this effect, you embed private objects of existing classes inside your new classes.

Sometimes it makes sense to allow the class user to directly access the composition of your new class; that is, to make the member objects public. The member objects use implementation hiding themselves, so this is a safe thing to do. When the user knows you're assembling a bunch of parts, it makes the interface easier to understand.

When you inherit, you take an existing class and make a special version of it. In general, this means that you're taking a general-purpose class and specializing it for a particular need.

The is-a relationship is expressed with inheritance, and the has-a relationship is expressed with composition.

In OOP, the most likely way that you'll create and use code is by simply packaging data and methods together into a class, and using objects of that class. You'll also use existing classes to build new classes with composition. Less frequently, you'll use inheritance. So although inheritance gets a lot of emphasis while learning OOP, it doesn't mean that you should use it everywhere you possibly can. On the contrary, you should use it sparingly, only when it's clear that inheritance is useful. One of the clearest ways to determine whether you should use composition or inheritance is to ask whether you'll ever need to upcast from your new class to the base class. If you must upcast, then inheritance is necessary, but if you don't need to upcast, then you should look closely at whether you need inheritance. The Polymorphism chapter provides one of the most compelling reasons for upcasting, but if you remember to ask "Do I need to upcast?" you'll have a good tool for deciding between composition and inheritance.
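The two relationships side by side, as a sketch with made-up classes. Car has an Engine and hides it behind its own interface; Circle is a Shape, and the only reason inheritance is needed is that render upcasts it.

```java
// Sketch only: has-a (composition) next to is-a (inheritance).
class Engine {
    private boolean running;

    void start() { running = true; }
    boolean isRunning() { return running; }
}

class Car {
    private final Engine engine = new Engine();   // has-a: embedded private object

    void drive() { engine.start(); }              // Car's own interface, not Engine's
    boolean isMoving() { return engine.isRunning(); }
}

class Shape {
    String draw() { return "shape"; }
}

class Circle extends Shape {                      // is-a: specialization
    @Override
    String draw() { return "circle"; }
}

class CompositionDemo {
    // Accepts any Shape, so passing a Circle is an upcast.
    static String render(Shape s) {
        return s.draw();
    }

    public static void main(String[] args) {
        Car car = new Car();
        car.drive();
        System.out.println(car.isMoving());        // prints true
        System.out.println(render(new Circle()));  // prints circle
    }
}
```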

Overloading and Overriding

Overloading is one of the ways in which Java implements polymorphism, one of the key concepts of object orientation. Overloaded methods are differentiated only by the number, type, and order of their parameters, not by the return type. (In brief: same name, different signatures, different implementations.)


Overriding a method means that its entire functionality is being replaced. It is something done in a child class to a method defined in a parent class. To override a method, a new method is defined in the child class with exactly the same signature as the one in the parent class. (In brief: same signature, different implementation.)

Java SE5 has added the @Override annotation, which is not a keyword but can be used as if it were. When you mean to override a method, you can choose to add this annotation and the compiler will produce an error message if you accidentally overload instead of overriding.

The @Override annotation will thus prevent you from accidentally overloading when you don't mean to.
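Both ideas in one sketch, with made-up class names. Animal overloads speak; Puppy overrides the no-argument version, and the inherited overload picks up the override through dynamic dispatch.

```java
// Sketch only: overloading vs. overriding.
class Animal {
    String speak() {
        return "...";
    }

    // Overload: same name, different parameter list.
    String speak(int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(speak());   // calls whichever speak() the actual object defines
        }
        return sb.toString();
    }
}

class Puppy extends Animal {
    // Override: exactly the same signature as Animal.speak().
    @Override
    String speak() {
        return "Woof";
    }

    // @Override
    // String speak(String mood) { ... }  // would not compile: this overloads, it overrides nothing
}

class OverrideDemo {
    public static void main(String[] args) {
        Animal a = new Puppy();
        System.out.println(a.speak());   // prints Woof
        System.out.println(a.speak(2));  // prints WoofWoof
    }
}
```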

When to initialize an object

It makes sense that the compiler doesn't just create a default object for every reference, because that would incur unnecessary overhead in many cases. If you want the references initialized, you can do it:

1. At the point the objects are defined. This means that they'll always be initialized before the constructor is called.

2. In the constructor for that class.

3. Right before you actually need to use the object. This is often called lazy initialization. It can reduce overhead in situations where object creation is expensive and the object doesn't need to be created every time.

4. Using instance initialization.
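The four options in one made-up class, as a sketch. The numbers in the comments match the list above.

```java
// Sketch only: four places to initialize a reference.
class LazyReport {
    private final String title = "Weekly report";  // 1. at the point of definition
    private final String author;                   // 2. in the constructor
    private StringBuilder body;                    // 3. lazily, right before first use
    private final int[] pageCounts;                // 4. in an instance initializer

    {
        pageCounts = new int[] {0};
    }

    LazyReport(String author) {
        this.author = author;
    }

    StringBuilder body() {
        if (body == null) {            // created only when somebody asks for it
            body = new StringBuilder();
        }
        return body;
    }

    boolean bodyCreated() { return body != null; }
    String header() { return title + " by " + author; }
    int pages() { return pageCounts[0]; }
}
```

Until body() is called for the first time, no StringBuilder exists, which is the saving that lazy initialization is after.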

Wednesday, December 3, 2008

static data initialization & The creating process of an object

Note:
1. The static initialization occurs only if it's necessary.
2. The static variables will only be initialized when the first static access occurs, and only be initialized once.
3. The order of initialization is statics first, if they haven't already been initialized by a previous object creation, and then the non-static objects.

To summarize the process of creating an object, consider a class called Dog:

1. Even though it doesn't explicitly use the static keyword, the constructor is actually a static method. So the first time an object of type Dog is created, or the first time a static method or static field of class Dog is accessed, the Java interpreter must locate Dog.class, which it does by searching through the classpath.

2. As Dog.class is loaded (creating a Class object), all of its static initializers are run. Thus, static initialization takes place only once, as the Class object is loaded for the first time.

3. When you create a new Dog(), the construction process for a Dog object first allocates enough storage for a Dog object on the heap.

4. This storage is wiped to zero, automatically setting all the primitives in that Dog object to their default values (zero for numbers and the equivalent for boolean and char) and the references to null.

5. Any initializations that occur at the point of field definition are executed.

6. Constructors are executed.
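The order can be watched with a small sketch. The names are made up, and the log lives in a separate helper class so that it is ready before Kennel's own static initializer runs.

```java
// Sketch only: recording the order of static and instance initialization.
class InitLog {
    static final StringBuilder STEPS = new StringBuilder();

    static int record(String step) {
        STEPS.append(step).append(';');
        return 0;
    }
}

class Kennel {
    static int staticField = InitLog.record("static field");   // once, when the class is initialized
    int instanceField = InitLog.record("instance field");      // for every object, before the constructor body

    Kennel() {
        InitLog.record("constructor");
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        new Kennel();
        new Kennel();
        // prints static field;instance field;constructor;instance field;constructor;
        System.out.println(InitLog.STEPS);
    }
}
```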

Tuesday, December 2, 2008

The meaning of static

With the this keyword in mind, you can more fully understand what it means to make a method static. It means that there is no this for that particular method. You cannot call non-static methods from inside static methods (although the reverse is possible), and you can call a static method for the class itself, without any object. In fact, that's primarily what a static method is for. It's as if you're creating the equivalent of a global method. However, global methods are not permitted in Java, and putting the static method inside a class allows it access to other static methods and to static fields.
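A sketch with a made-up class. The static method is called on the class itself, the instance method calls the static one, and the reverse direction is left as a comment because it does not compile.

```java
// Sketch only: a static method has no this.
class Temperature {
    private static int conversions = 0;   // one counter shared by the whole class
    private final double celsius;

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Static: callable without any object, may touch static fields only.
    static double toFahrenheit(double c) {
        conversions++;
        // return fahrenheit();   // would not compile: no this to call it on
        return c * 9 / 5 + 32;
    }

    // Non-static: may call the static method freely.
    double fahrenheit() {
        return toFahrenheit(celsius);
    }

    static int conversions() {
        return conversions;
    }

    public static void main(String[] args) {
        System.out.println(Temperature.toFahrenheit(100));     // prints 212.0
        System.out.println(new Temperature(0).fahrenheit());   // prints 32.0
    }
}
```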

Monday, December 1, 2008

Some common data structures.

Array: An array is a data structure consisting of a group of elements that are accessed by indexing. In most programming languages each element has the same data type and the array occupies a contiguous area of storage.
on Wiki
Deque: A deque is an abstract list type data structure, also called a head-tail linked list, for which elements can be added to or removed from the front(head) or back(tail).
on Wiki
Heap: A heap is a specialized tree-based data structure that satisfies the heap property.
on Wiki
Linked list: A linked list is one of the fundamental data structures, and can be used to implement other data structures. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes. The principal benefit of a linked list over a conventional array is that the order of the linked items may be different from the order in which the data items are stored in memory or on disk, allowing the list of items to be traversed in a different order. A linked list is a self-referential datatype because it contains a pointer or link to another datum of the same type. Linked lists permit insertion and removal of nodes at any point in the list in constant time, but do not allow random access. Several different types of linked list exist: singly-linked lists, doubly-linked lists, and circularly-linked lists.
on Wiki
Queue: First-In-First-Out
on Wiki
Stack: Last In First Out
on Wiki
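In Java, a deque can play both the queue and the stack role. Here is a sketch using java.util.ArrayDeque from the standard library; the class and method names are mine.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: one Deque type used as a FIFO queue and as a LIFO stack.
class DequeDemo {

    // Queue: add at the tail, remove from the head.
    static String fifo() {
        Deque<String> queue = new ArrayDeque<>();
        queue.addLast("a");
        queue.addLast("b");
        queue.addLast("c");
        return queue.removeFirst() + queue.removeFirst() + queue.removeFirst();
    }

    // Stack: push and pop both work on the head.
    static String lifo() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        return stack.pop() + stack.pop() + stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(fifo()); // prints abc
        System.out.println(lifo()); // prints cba
    }
}
```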

Introduction to objects

I like the example of objects in the classic Java book, Thinking in Java.
Type name:
+--------------+
| Light        |
+--------------+
Interface:
| on()         |
| off()        |
| brighten()   |
| dim()        |
+--------------+

    Light lt = new Light();
    lt.on();

The interface determines the requests that you can make for a particular object. A type has a method associated with each possible request, and when you make a particular request to an object, that method is called.

Here, the name of the type/class is Light, the name of this particular Light object is lt, and the requests that you can make of a Light object are to turn it on, turn it off, make it brighter, or make it dimmer. You create a Light object by defining a "reference"(lt) for that object and calling new to request a new object of that type. To send a message to the object, you state the name of the object and connect it to the message request with a period(dot).
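The book only shows the interface. A possible implementation, as a sketch: the two fields, the 0 to 10 brightness scale and the two query methods are my own assumptions, not from the book.

```java
// Sketch only: one way the Light type could be implemented.
class Light {
    private boolean on;          // assumed state
    private int brightness = 5;  // assumed scale from 0 (darkest) to 10 (brightest)

    void on() { on = true; }
    void off() { on = false; }

    void brighten() {
        if (brightness < 10) {
            brightness++;
        }
    }

    void dim() {
        if (brightness > 0) {
            brightness--;
        }
    }

    // Query methods added for the demo; they are not part of the book's interface.
    boolean isOn() { return on; }
    int brightness() { return brightness; }
}

class LightDemo {
    public static void main(String[] args) {
        Light lt = new Light();   // define the reference and create the object
        lt.on();                  // send the message: object name, dot, request
        lt.brighten();
        System.out.println(lt.isOn());        // prints true
        System.out.println(lt.brightness());  // prints 6
    }
}
```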

One problem people have when designing objects is cramming too much functionality into one object. For example, in your check printing module, you may decide you need an object that knows all about formatting and printing. You'll probably discover that this is too much for one object, and that what you need is three or more objects. One object might be a catalog of all the possible check layouts, which can be queried for information about how to print a check. One object or set of objects can be a generic printing interface that knows all about different kinds of printers. And a third object could use the services of the other two to accomplish the task. Thus, each object has a cohesive set of services it offers. In a good object-oriented design, each object does one thing well, but doesn't try to do too much.

What will I do today 2008.12.1

December 1, a new start for a new month, the last month of this critical year. I have a clear goal now. What I need to do is my best to reach it as soon as possible. Keith, I am sorry I disappointed you again. I swear I won't do it again. Although I cannot change the situation, I swear I will do my best in everything once I start on it. Let's see.

Today, my plan is to rewrite the report for you, in which all the thoughts we discussed before will be described at length. I will try to finish it today. If not, the deadline is 12:00pm tomorrow.

Let's go.
-----------------------
Sigh, I only finished the first three chapters of Thinking in Java. Should I say sorry? No. To whom? No regrets, just put more attention and effort into the daily work.

To be a bioengineer

An email from the Cytoscape research group appeared in my mailbox this morning. It said they are offering a position for bioengineers who are expert in Java. I clicked on the link they listed and saw this: Hiring Salary Range: $56,855 ~ $77,145 /year

I have no idea if it's enough for the cost of living in San Diego, but I feel it should be a good salary: roughly $4,700 to $6,400 per month. Although I am still far from being a so-called programming expert, I'd like to regard it as good motivation for a possibly comfortable life abroad in the future.

Yes, work harder and harder, as hard as possible. Go!

The link: Cytoscape Java programming position at UCSD