Nathan Backman MedIX REU Summer 2005
Weekly Journal

Week 1
For the first few days in the MedIX program, we all came together in a meeting room and went over some of the basics of image processing along with introductions to some of the research projects. None of the non-DePaul students had any experience with image processing, so it was new to most of us, but for the most part very straightforward. We also got a crash course in Matlab to get a feel for the environment we'd be using for the rest of the summer. At this point I was fairly interested in the classification side of things, since I had a bit of a background in that area, and also in volumetric texture analysis, since there didn't seem to be much done in that area yet. I wasn't as interested in the proposed positions at Northwestern since they seemed to leave much less room for research. In the end I was placed on the volumetric texture analysis group with Brian, since we were the only ones interested in the area.

Week 2
Brian and I started to ponder the possibilities of volumetric texture analysis and read up on what techniques had been tried so far. We looked through the code of previous people who had worked on similar projects, saw what they were doing, and were able to reproduce their results. There were definite problems with not having any quality segmented images, which had the potential to really hinder the classification process, so Brian and I started to look into different types of volumetric segmentation and landed on region growing as a prime candidate. We implemented our own fairly simplistic region growing algorithm, with just a few texture descriptors to compare to start off with. We ran some tests and we were able to fill regions! Not extremely well just yet, but it was a start.
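
To give a rough idea of the kind of thing we started with, here is a minimal region-growing sketch in Matlab. It accepts a voxel if its raw intensity is close enough to the seed's (the real version compares texture descriptors), and the function name, tolerance, and 6-connected growth are just illustrative assumptions, not our actual code:

% Minimal 3-D region-growing sketch (illustrative only). The real algorithm
% compares texture descriptors; here a voxel is accepted if its raw intensity
% is within an arbitrary tolerance 'tol' of the seed's intensity.
function region = growRegion(vol, seed, tol)
    region  = false(size(vol));                     % accepted voxels
    visited = false(size(vol));
    seedVal = vol(seed(1), seed(2), seed(3));
    queue   = seed;                                 % plain FIFO queue -> breadth-first growth
    visited(seed(1), seed(2), seed(3)) = true;
    offsets = [1 0 0; -1 0 0; 0 1 0; 0 -1 0; 0 0 1; 0 0 -1];
    while ~isempty(queue)
        v = queue(1, :);  queue(1, :) = [];
        if abs(vol(v(1), v(2), v(3)) - seedVal) <= tol
            region(v(1), v(2), v(3)) = true;
            for k = 1:6                             % push unvisited 6-connected neighbours
                n = v + offsets(k, :);
                if all(n >= 1) && all(n <= size(vol)) && ~visited(n(1), n(2), n(3))
                    visited(n(1), n(2), n(3)) = true;
                    queue(end + 1, :) = n;          %#ok<AGROW>
                end
            end
        end
    end
end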

Week 3
We took the work we had done the previous week and tried to improve it in a few ways. We wrote out a few different methods of calculating co-occurrence matrices in search of the most efficient one. Eventually we also saw a need to be more inclusive toward pixels on the edge of a given region, since they may deviate somewhat from those in the center but still belong to the region. We came up with the idea of a weighted moving average to incorporate into our breadth-first search so that it is more accepting of slight changes. To further improve the algorithm we added something of a priority queue to the breadth-first search so that the most similar pixels are added to the region first; that way a less-resembling pixel doesn't get included early on and slightly taint the moving average. Results are now looking much more promising, but there is still a good bit of work to be done.
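
For reference, here is one simple way to build a grey-level co-occurrence matrix for a small quantised neighbourhood in Matlab. The neighbourhood block, the number of grey levels, and the displacement vector are assumed inputs for illustration, not our exact implementation:

% Grey-level co-occurrence matrix for a small neighbourhood of quantised
% levels in 1..nLevels (a sketch; d is a displacement vector, e.g. [1 0 0]).
function C = cooccurrence(nbhd, nLevels, d)
    C = zeros(nLevels);
    [nx, ny, nz] = size(nbhd);
    for x = 1:nx
        for y = 1:ny
            for z = 1:nz
                p = [x y z] + d;
                if all(p >= 1) && all(p <= [nx ny nz])
                    i = nbhd(x, y, z);              % grey level at this voxel
                    j = nbhd(p(1), p(2), p(3));     % grey level at the displaced voxel
                    C(i, j) = C(i, j) + 1;
                end
            end
        end
    end
    C = C / max(sum(C(:)), 1);                      % normalise to a joint probability
end

The weighted moving average itself is nothing fancy: as each pixel is accepted, the region's descriptor vector is updated as regAvg = (1 - alpha)*regAvg + alpha*newDescriptor, where alpha is a small assumed weight.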

Week 4
This week we wanted to clean up and refine some of our output. We had a few leads and a few ideas and decided to run with them. One of the things we wanted to try was clipped binning. Implementing this seemed to help quite a bit and allowed our region growing algorithm to differentiate more between neighboring pixels. We then looked into various image post-processing techniques and decided to dilate and blur the binary images that represent the segmented region in order to fill in gaps and smooth out the edges. The results we got were really quite impressive; the regions very much looked like the organs we were segmenting. With a tip from Dr. Raicu, we also tried another method of measuring the difference between pixels: she suggested we use the Jeffrey divergence instead of the Euclidean distance. The Jeffrey divergence is a LOT more computationally expensive and makes our algorithm run about 3-4 times longer, but the results are a LOT better. The regions come out more naturally filled in, and spilling out of the region before it has been filled is much less likely.

We have also experimented with other texture descriptors in order to use more data to distinguish between neighboring pixels. We're still a bit puzzled as to how to normalize some of them. We took a large random sample of pixels and calculated all 10 texture descriptors for each in order to find the distribution of values for each descriptor. 4 of the 10 texture descriptors lean towards normal distributions, while the other 6 are more logarithmic. Once we figure out how to normalize these we may be able to use the texture descriptors to further compare neighboring pixels.
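
The Jeffrey divergence is, in its usual formulation, a symmetrized relative entropy between two distributions. A small Matlab sketch of that formula, applied here to two normalized co-occurrence matrices (an assumption about the inputs; the eps guard just avoids log of zero):

% Jeffrey divergence between two normalised co-occurrence matrices
% (the symmetrised relative-entropy form).
function d = jeffreyDivergence(P, Q)
    p = P(:) + eps;
    q = Q(:) + eps;
    m = (p + q) / 2;
    d = sum(p .* log(p ./ m) + q .* log(q ./ m));
end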

Week 5
Presentations were the focus of the beginning of our fifth week. We spent all of Tuesday presenting to one another and to a few other high-profile DePaul folk. As far as research was concerned, Brian and I made quite a bit of progress on the normalization of texture descriptors. To this point we had been following everyone else's example and using min-max normalization. We were having problems with this because the distributions of the texture descriptors were far from uniform. Of the 10 texture descriptors we looked over, we found that 4 were somewhat normally distributed and the other 6 had exponential distributions. This caused problems when comparing deviations in texture descriptors, since each descriptor had a different distribution (some of them exponential). We then decided to examine differences in texture descriptors by finding their z-scores within their distributions and noting the probability between the two. This worked incredibly well for the distributions that were fairly normal, but we still had problems with the exponential distributions; by taking a log transform we were able to make those distributions look normal as well. This gives a distance metric that is much more accurate and has produced fantastic results! Our results used to be somewhat sparse in how the pixels were filled in, so there was a bit of an outline of the organ that was only sparsely filled in. The results are now perfectly filled in. Without any post-processing we are able to get incredibly cool results.

One of the other things we have been examining is the ability to keep track of how quickly the distances deviate from the seed. We have recently tried recording the deviation of each pixel in our region from the seed, in terms of pixel-level texture descriptors, and we have made histograms of our findings. When doing this we have purposely filled beyond the boundaries of the region in order to examine the histogram and find traits or characteristics of when we leak from the region. This is where it gets really exciting. Looking at the histogram, which due to our prioritized growth represents deviation and time on the same axis, we have been able to identify when leaking has occurred by finding a sharp cut-off in similar deviations followed by an outward expansion toward increased deviations. By looking at these histograms to identify points of such leakage, we have been able to successfully segment many organs without ever seeing intermediate images. So essentially we can get a great segmentation without ever looking at what we're segmenting. We figure we can create an algorithm to find that cut-off point in the histogram for us so that the program will automatically know when to stop in order to avoid leaking out of the region. Good times...
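
Here is a rough Matlab sketch of the distribution-aware comparison described above. The descriptor means, standard deviations, and the flags marking which descriptors need the log transform are assumed to come from the random-sample study; this illustrates the idea rather than our exact code:

% Sketch of the distribution-aware comparison between two descriptor vectors.
% Descriptors whose sample distribution looked exponential are log-transformed
% first, then each descriptor is turned into a z-score against sample
% statistics (mu, sigma, computed on the transformed values), and the
% probability mass between the two values is read off the normal CDF.
% All variable names here are assumptions.
function dist = descriptorDistance(a, b, mu, sigma, isLogScaled)
    a(isLogScaled) = log(a(isLogScaled) + eps);     % make skewed descriptors ~normal
    b(isLogScaled) = log(b(isLogScaled) + eps);
    za = (a - mu) ./ sigma;                         % z-scores within each descriptor's distribution
    zb = (b - mu) ./ sigma;
    dist = mean(abs(normcdf(za) - normcdf(zb)));    % probability between the two values
end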

Week 6
We made a bit more progress on the normalization of texture descriptors. Some of the texture descriptors were already fairly normal in their distribution, but others were exponential. In order to use the same Mahalanobis-style method on the exponential distributions that we use on the normal distributions, we applied a log transform to the exponential ones, and they then very much resembled normal distributions. That let us use the Mahalanobis-style method on our 4 texture descriptors: Sum Mean, Entropy, Cluster Tendency, and Variance. We also started looking into our priority-queue structure, which holds all voxels under consideration as well as voxels already in the region; by doing this we are able to examine the rate at which voxels similar to the region are added. In doing this we also pretty much ditched the weighted moving average and focused just on the seed so we could calculate a non-"moving" histogram. The histogram appeared to generate bell-shaped structures for regions that it filled. We then decided it would be valuable to generate a stopping criterion from this information by using the point of inflection of the bell curve as a stopping threshold.
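
A sketch of the deviation histogram idea in Matlab. The diagonal (per-descriptor standard deviation) form of the Mahalanobis-style distance and the variable names for the accepted voxels and their descriptors are assumptions for illustration:

% Mahalanobis-style deviation from the seed over the four descriptors we kept
% (Sum Mean, Entropy, Cluster Tendency, Variance); sigma, seedDescriptors,
% voxelDescriptors and acceptedVoxels are assumed variable names.
devFromSeed = @(v, s, sigma) sqrt(sum(((v - s) ./ sigma).^2));

deviations = zeros(1, numel(acceptedVoxels));
for i = 1:numel(acceptedVoxels)                     % in order of acceptance
    deviations(i) = devFromSeed(voxelDescriptors(acceptedVoxels(i), :), ...
                                seedDescriptors, sigma);
end
hist(deviations, 50);                               % bell-shaped for well-contained regions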

Week 7
In order to make our algorithm less seed-specific, we changed our priority queue to order by similarity to the computed region average. We then had to figure out how to represent our histogram: how far it should extend, how many "bins" we should be using, and how we would calculate the slope in order to find the point of inflection. We considered taking the slope at a bin and the previous couple of bins and averaging them to reduce the effect of noise. We actually ended up ditching the whole point-of-inflection idea, because the way the curve fell off was not consistent from region to region, and we resorted to finding the peak of the bell curve. We noticed that in finding the peak of a histogram representing a region, we had pretty much found half of the voxels within the region, so we decided to use that as the stopping criterion. We didn't need to fill the entire region; when we tried to do so we were much more apt to leak. Instead we can fill only a portion of the region, which defines its shape well, and then use morphological operators to fill in the region and enhance the shape.

By using this peak we have gotten incredible results! So far we have successfully segmented pretty much everything we wanted, perfectly, with the exception of a spleen in some of the new data we got. There seems to be much less contrast in that area of the new data relative to the other data. It's really weird. The new data is more isometric but doesn't seem to be of the same quality as the old data. With the new data it's often very difficult to distinguish organs with your eyes, just by looking at the CT scan, where it was very easy to do so with all the other data. We also submitted to a conference at the beginning of the week, so that was pretty cool.
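
As an example of the kind of morphological post-processing we mean, here is a short Matlab sketch that dilates, fills, and erodes the partial binary mask slice by slice; the structuring-element size is just a placeholder and the exact operators we use may differ:

% Illustrative morphological post-processing on the partial binary mask.
se = strel('disk', 3);
for k = 1:size(mask, 3)
    slice = imdilate(mask(:, :, k), se);            % close small gaps
    slice = imfill(slice, 'holes');                 % fill the interior
    mask(:, :, k) = imerode(slice, se);             % erode back toward the true boundary
end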

Week 8
Well, we submitted to another conference this week, so that's pretty cool again. We also did some MAJOR optimization of our code. We were experiencing slow runtimes with large organs because our priority queue was so huge and we had to keep iterating through the whole queue. We made a little lookup table that lets a voxel jump to the approximate spot in the queue where it belongs, and this did wonders! A spleen would have taken 48 hours to compute and now it's done in about 3. Quite wonderful. I also did a major overhaul in terms of memory management. We were totally running out of RAM and weren't able to load as many images as we needed to do things like segment the entire liver at once. I did some tricky stuff there, and now it seems like we'll never have a memory problem again! Brian has been working on a GUI for our program, which is pretty awesome and very handy.

On a side note, I was looking through some ITK stuff in order to convert some raw volume data I got for Dr. Raicu into 2D DICOM slices. This gave me problems for a LONG time, but I was finally able to hack something together (without ITK, and in Matlab actually!). I was getting a ton of problems because I had made assumptions about the DICOM format based on what I saw from Matlab. Matlab always presented values between 0 and 4095 for the DICOM files I had, but that's just how Matlab represented the data. I assumed that I should be reading in 16-bit unsigned ints, but no, I needed signed data. So now I've got some pretty cool Matlab code that reads in an mhd header file corresponding to an img file and puts together 2D DICOM slices. Cool stuff. At least I think so...
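
A stripped-down sketch of that conversion in Matlab. The dimensions and file names are made-up placeholders (the real values come from the .mhd header, which this sketch doesn't actually parse), but it shows the important part: reading the raw data as signed 16-bit integers before writing the slices out with dicomwrite:

% Sketch: raw .img volume -> per-slice DICOM files.
dims = [512 512 129];                               % assumed DimSize from the .mhd header
fid  = fopen('volume.img', 'r');                    % hypothetical raw data file
vol  = fread(fid, prod(dims), 'int16');             % signed 16-bit values, not uint16!
fclose(fid);
vol  = reshape(vol, dims);
for k = 1:dims(3)
    dicomwrite(int16(vol(:, :, k)), sprintf('slice_%03d.dcm', k));
end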

Week 9
This week we did a lot of polishing of our algorithm and code. We looked at different methods of filtering our histogram data and then came to the conclusion that filtering wasn't the route we should be taking. We had originally been trying to use the peak and/or point of inflection to find a stopping place along the histogram, but finding the point of inflection of the curve required calculating many slopes. Calculating slopes in a histogram is pretty darn difficult! Well... it's difficult to do well, at least! The problem is that the histogram doesn't give a truly smooth curve; there is some noise in it. This means that to find the slope at a point you may have to average multiple slopes in the approximate area, and even do some DSP-style filtering to get rid of weird stuff. In the end we ditched the point-of-inflection method, and filtering the histogram at all, and resorted to the raw data and observing the peak of the histogram. An idea we had toyed with a few weeks ago was looking for a percent drop from the peak. That ensures we utilize most of the curve and lets us fill a similar proportion of each region. We also figured out a cool way to improve our histogram. We found that with more bins we could increase our accuracy when determining a stopping point, but a large number of bins at early stages would give our histogram a fairly wacky distribution. So we developed a bin-expansion scheme that starts with a given number of bins and, for every (power of 2)*1000 voxels in our region, adds another 10 bins. This lets us increase the number of bins at a reasonable rate.
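
A small Matlab sketch of the two ideas above, the percent drop from the peak and the bin-expansion scheme. The starting bin count and the drop percentage are assumed values, and the doubling thresholds (1000, 2000, 4000, ... voxels) are my reading of the "(power of 2)*1000" rule:

% Sketch of the raw-histogram stopping rule with expanding bins.
dropFrac = 0.5;                                     % e.g. stop at a 50% drop from the peak
nBins    = 10 + 10 * floor(log2(max(numVoxels, 1000) / 1000));   % +10 bins per doubling past 1000 voxels
counts   = hist(deviations, nBins);
[peakVal, peakBin] = max(counts);
stopBin  = peakBin + find(counts(peakBin:end) < (1 - dropFrac) * peakVal, 1) - 1;
% stopBin is empty until the counts have actually dropped that far past the peak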

Week 10
This week has mainly been spent tying up loose ends and on presentations. The first part of the week was focused on expanding and improving our presentations, and I think ours has come a long way in terms of understandability. We then all presented our work, had some good food, and presented some more. After the presentations we had some more good food at an Indian restaurant. Quite nice, and fun as well. Brian and I ended up writing quite a bit of documentation for our algorithm, and we did a fair bit of code commenting as well as general organization of all of our papers/code/data/results and things of the sort. It's been a pretty fun summer. We accomplished quite a bit.

 
Copyright © 2005 Nathan Backman, All Rights Reserved