Projects:Fish heart and breath rate sensing

From Collective Computational Unit
== Aims ==

The aim of the project is to calculate the frequency of heart beats and breathing of a fish.
 
== Provided data ==
 
== Suggested/tested approaches ==

The pipeline consists of multiple steps, each of which has a drastic influence on the resulting quality. The scene contains either a single fish, or several fish that are each tracked individually and independently of the other fish present. The tracking stage yields a bounding box, and the image within the bounding box is then fed to a minimalist encoder-decoder network whose latent space is used to detect the different phases of the fish (of either the heartbeat or the breathing cycle).

The architecture of the network is very minimalist (consecutive 3x3 convolutions; the image is rescaled to 64x64 pixels with three channels) because of the extremely small training dataset, which on average consists of no more than 2000 frames. Overfitting is prevented by using a very short training stage (100 iterations, far fewer than is usually used). Bounding boxes from different videos cannot be combined because the setting (fish, scene) and alignment differ for each video.
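A minimal sketch of such a per-video autoencoder, assuming PyTorch; the layer widths, latent size and optimiser settings are illustrative guesses and not taken from the project's Autoencoder.zip:

```python
# Hypothetical sketch: tiny convolutional autoencoder for 64x64x3 fish crops,
# trained for only ~100 iterations to avoid overfitting the small dataset.
import torch
import torch.nn as nn

class FishAutoencoder(nn.Module):
    def __init__(self, latent_channels=8):
        super().__init__()
        # Encoder: consecutive 3x3 convolutions, downsampling 64x64 -> 8x8
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # -> 32x32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # -> 16x16
            nn.Conv2d(32, latent_channels, 3, stride=2, padding=1) # -> 8x8
        )
        # Decoder mirrors the encoder back to 64x64x3
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid()
        )

    def forward(self, x):
        z = self.encoder(x)           # latent space used for phase detection
        return self.decoder(z), z

model = FishAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frames = torch.rand(16, 3, 64, 64)    # stand-in for cropped fish frames

# Very short training stage (about 100 iterations)
for _ in range(100):
    recon, z = model(frames)
    loss = nn.functional.mse_loss(recon, frames)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because each video has its own setting and alignment, such a model would be trained from scratch on each video's bounding-box crops.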
  
 
The latent space usually consists of multiple dimensions, some of which are redundant, so we calculate the average of the entire latent space for each frame. Its absolute value provides a rough clue as to which phase the fish is in.

Unfortunately, the popular embedding approach (t-SNE) does not seem to work (see Results, left figure).
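The averaging, low-pass filtering and peak detection used to extract a rate from the latent signal can be sketched as follows with NumPy/SciPy; the frame rate, cutoff frequency and function names here are assumptions, not values from the project code:

```python
# Hypothetical sketch: per-frame latent average -> low-pass filter -> peaks.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_rate(latent_per_frame, fps=25.0, cutoff_hz=2.0):
    """latent_per_frame: (n_frames, latent_dim) array of latent vectors."""
    # Average of the absolute value of the entire latent space, per frame
    signal = np.abs(latent_per_frame).mean(axis=1)
    # Keep only the lower frequencies, where breathing is expected to live
    b, a = butter(2, cutoff_hz / (fps / 2.0), btype="low")
    filtered = filtfilt(b, a, signal)
    # Detected peaks should correspond to the breath phase
    peaks, _ = find_peaks(filtered, distance=int(fps / (2 * cutoff_hz)))
    duration_s = len(signal) / fps
    return len(peaks) / duration_s  # estimated rate in events per second

# Synthetic check: a 1 Hz oscillation buried in 8-D latent averages
t = np.arange(0, 10, 1 / 25.0)
rng = np.random.default_rng(0)
fake_latent = (0.5 + 0.2 * np.sin(2 * np.pi * 1.0 * t))[:, None] \
              + 0.01 * rng.normal(size=(len(t), 8))
rate = estimate_rate(fake_latent, fps=25.0)
```

On real data the cutoff would be chosen so that breathing and heartbeat frequencies fall into separate bands.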
== Code ==

[[File:Autoencoder.zip|thumb|autoencoder]]
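For reference, the t-SNE embedding and EM clustering shown in the Results figures can be sketched with scikit-learn; this is a stand-in under assumed parameters, and the original code in Autoencoder.zip may differ:

```python
# Hypothetical sketch: embed per-frame latent vectors to 2D with t-SNE and
# cluster with a Gaussian mixture fitted by EM (one colour per component).
import numpy as np
from sklearn.manifold import TSNE
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in latent vectors: two well-separated synthetic "phases", 8-D each
latents = np.vstack([rng.normal(0.0, 0.3, size=(40, 8)),
                     rng.normal(2.0, 0.3, size=(40, 8))])

# Project the latent space to 2D for visualisation
embedded = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(latents)

# EM clustering of the embedded points; predict() assigns a state per frame
gmm = GaussianMixture(n_components=2, random_state=0).fit(embedded)
labels = gmm.predict(embedded)
```

On the real latent vectors this separation did not emerge, which is why the peak-detection approach on the averaged latent signal was used instead.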
  
 
== Results ==

{|
| [[File:feeding_2_30sec_TSNE_graphs.png|thumb|(Top) t-SNE embedding of the latent space into 2D after clustering with EM (each colour is a different cluster). (Bottom) Assignment of a state to each frame based on clustering of the latent space.]]
| [[File:Feeding 2 30sec graphs.png|thumb|(Top) Sum of the absolute value of the latent space. (Bottom) Filtered lower frequency; the red dots are detected peaks, which should correspond to the detected breath phase.]]
| [https://www.dropbox.com/s/lpe12dp3jatotbk/feeding_2_30sec_res.avi?dl=0 Video]
|}
{|
| [[File:Feeding 5 30sec TSNE graphs.png|thumb|(Top) t-SNE embedding of the latent space into 2D after clustering with EM (each colour is a different cluster). (Bottom) Assignment of a state to each frame based on clustering of the latent space.]]
| [[File:Feeding 5 30sec graphs.png|thumb|(Top) Sum of the absolute value of the latent space. (Bottom) Filtered lower frequency; the red dots are detected peaks, which should correspond to the detected breath phase.]]
| [https://www.dropbox.com/s/6ax3kae29o5mjbu/feeding_5_30sec_res.avi?dl=0 Video]
|}
{|
| [[File:Feeding 6 24sec TSNE graphs.png|thumb|(Top) t-SNE embedding of the latent space into 2D after clustering with EM (each colour is a different cluster). (Bottom) Assignment of a state to each frame based on clustering of the latent space.]]
| [[File:Feeding 6 24sec graphs.png|thumb|(Top) Sum of the absolute value of the latent space. (Bottom) Filtered lower frequency; the red dots are detected peaks, which should correspond to the detected breath phase.]]
| [https://www.dropbox.com/s/qevcqmsklavk1qv/feeding_6_24sec_res.avi?dl=0 Video]
|}
 
{|
| [[File:Isolation 3 29sec TSNE graphs.png|thumb|(Top) t-SNE embedding of the latent space into 2D after clustering with EM (each colour is a different cluster). (Bottom) Assignment of a state to each frame based on clustering of the latent space.]]
| [[File:Isolation 3 29sec graphs.png|thumb|(Top) Sum of the absolute value of the latent space. (Bottom) Filtered lower frequency; the red dots are detected peaks, which should correspond to the detected breath phase.]]
| [https://www.dropbox.com/s/aapw1hea1ee78r1/isolation_3_29sec_res.avi?dl=0 Video]
|}

Latest revision as of 18:11, 23 May 2019
