wiki:Other/Summer/2023/Features

Version 69 (modified by KatieLew, 16 months ago)

Neural Networks For Feature Analysis

Introductions

Mayank Barad
Rising Senior pursuing BSE in Computer Engineering and Computer Science

Daksh Khetarpaul
Rising Junior pursuing BSE in Computer Engineering

Katherine Lew
Rising Sophomore pursuing BS in Finance and Computer Science

Advisors - Dr. Richard Howard, Dr. Richard Martin

Project Poster

Project Description

Neural networks have a long history of being used for classification and, more recently, content generation. Example classifiers include image classifiers that distinguish dogs from cats and text sentiment classifiers. Example generative networks include those for human faces, images, and text. Rather than classification or generation, this work explores using networks for feature analysis. Intuitively, features are the high-level patterns that distinguish data, such as text and images, into different classes. Our goal is to explore bee motion datasets to qualitatively measure the ease or difficulty of reverse-engineering the features found by the neural networks.

Week 1 Progress

  • Defining objectives: We defined the objective of our project: to explore how well behavioral anomalies and patterns in bees are recognized by a neural network.

Neural Networks for Feature Analysis and Hive Monitoring are interrelated. The Hive Monitoring project is based on the hypothesis that changes in the earth's magnetic fields due to radio waves are affecting bee behavior. Their goal is to determine whether bees exposed to a magnetic field behave differently than unexposed bees. Our goal is to explore the use of neural networks to detect these changes.

How effective are neural networks at detecting minute changes in bee behavior?

  • Set up software: We set up Github and iLab accounts to collaborate and run the machine learning behavioral detection code, respectively.
  • Research neural networks: We researched neural networks to become familiar with the model and its core concepts.

Week 2 Progress

  • Observation: Visited the beehive to observe the behavior of real bees
  • Simulation: Made a prototype simulator with pygame; this first version was rejected because its motion looked unnatural
  • Power Law: Integrated "Power Law" for a more natural bee motion

Figures: bee garage → First Prototype → Applying "Power Law"
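The "Power Law" mentioned above is most likely a heavy-tailed step-length distribution, as in Lévy-flight models of animal foraging: many short moves punctuated by occasional long ones look far more natural than uniform random steps. A minimal sketch of this idea (the function names, exponent, and inverse-transform sampler are our own assumptions, not the project's code):

```python
import math
import random

def power_law_step(x_min=1.0, alpha=2.0, rng=random):
    """Draw a step length from a power-law (Pareto) distribution with
    density ~ x^(-alpha) for x >= x_min, via inverse-transform sampling.
    The heavy tail produces occasional long moves among many short ones."""
    u = rng.random()  # uniform in [0, 1)
    return x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))

def simulate_bee(n_steps=100, seed=42):
    """Toy bee path: a uniformly random heading each step, with a
    power-law distributed step length."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        theta = rng.uniform(0, 2 * math.pi)
        r = power_law_step(rng=rng)
        x += r * math.cos(theta)
        y += r * math.sin(theta)
        path.append((x, y))
    return path
```

With `alpha` near 2 the walk resembles Lévy foraging patterns reported for many animals; larger `alpha` values shorten the tail and make the motion closer to ordinary Brownian wandering.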

https://docs.google.com/presentation/d/1CUsZIpquM0ZV0PSCLitq5EsfMTJz4iVAQ4b3pk8_gQI/edit#slide=id.g20a62be0e34_0_0

Week 3 Progress

  • Randomness Function: We programmed a function that allows the user to adjust the degree of randomness of synthetic bee motion along a spectrum: 0.0 represents the "bee" moving completely at random, and 1.0 represents the "bee" moving in a distinct non-random pattern such as a clockwise circle.
  • Train model: We used the randomness function to train the machine learning model (an AlexNet-like architecture) to detect the difference between the random and non-random behavioral patterns. The model output a confusion matrix and an accuracy of 0.798 in identifying randomness.
  • Shannon's Entropy: We researched Shannon's entropy function as a measure of the model's accuracy and created a program that automates the calculation of the joint entropy of two discrete random variables within the random system (e.g., angle and distance).
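The joint-entropy calculation described above follows Shannon's formula H(X, Y) = -Σ p(x, y) log₂ p(x, y), estimated from paired samples. A minimal sketch of such a program (ours, not the project's script; it assumes the two variables, e.g. angle and distance, have already been binned into discrete values):

```python
import math
from collections import Counter

def joint_entropy(xs, ys):
    """Estimate the Shannon joint entropy H(X, Y) in bits from paired
    samples of two discrete random variables. Counts each observed
    (x, y) pair, converts counts to empirical probabilities, and sums
    -p * log2(p) over the joint distribution."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be the same length")
    counts = Counter(zip(xs, ys))
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

For two independent fair bits the four pairs are equally likely, so the joint entropy is 2 bits, the theoretical maximum; fully deterministic motion collapses it toward 0.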

https://docs.google.com/presentation/d/1_dZirzqPIkBOsUDAHdJSiHIpVgf-TdTRbP-K97CZeXM/edit#slide=id.g255fda0ef8b_0_458

Weeks 4 and 5 Progress

  • Validate results: We discovered a mistake in our training data, so last week's training results were invalid: a bias in the input data caused the model to learn irrelevant features.
  • Retrain model: We retrained the machine learning model using simpler test cases, like the black-white frame test. With simple black-and-white classes, our model obtained 100% accuracy. With more complicated classes, our model obtained 98% accuracy.
  • Reformat tar files: We altered the program to reformat the training data. Instead of combining the frames of the random bee simulator into a video, we compiled the data into a tar file consisting of a PNG image, a class label, and a metadata file for each frame in the simulation. We will use these tar files as training data for the model.
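The tar layout described above (one PNG, one class label, and one metadata file per frame, grouped by a shared basename) resembles the WebDataset convention for streaming training data. A sketch of how such a shard could be written with Python's standard tarfile module (the file naming scheme and metadata fields are our assumptions):

```python
import io
import json
import tarfile

def write_shard(path, frames):
    """Write frames to a tar shard. Each frame is a dict with raw
    'png' bytes, a 'cls' label, and a 'meta' dict; the three files
    share a basename so a loader can regroup them per sample."""
    def add(tar, name, data):
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    with tarfile.open(path, "w") as tar:
        for i, frame in enumerate(frames):
            base = f"frame_{i:06d}"
            add(tar, base + ".png", frame["png"])
            add(tar, base + ".cls", str(frame["cls"]).encode())
            add(tar, base + ".json", json.dumps(frame["meta"]).encode())
```

Keeping frames as individually addressable files, rather than encoding a video, lets the training pipeline read samples sequentially from the archive without decoding or seeking.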

https://docs.google.com/presentation/d/10LL2DQ5wy2e9MNfBmUn3hnLoixhM3UpnMleaLXY5jwk/edit#slide=id.g2587e9ef81d_0_172

Week 6 Progress

  • Time Varying Features: To train the model to capture time-varying features (motion), we increased the number of input channels while keeping the same kernel size. This works for small movements in the training data.
  • Clockwise-Anticlockwise Test: With the time-varying features accounted for, we began to train the model with patterns of motion instead of simple black-and-white frames. For instance, we created training data with one class of frames that move in a clockwise direction and one class of frames that move in a counterclockwise direction. Can the model detect left versus right rotations?
  • Entropy v. Accuracy Graphs: We created a graph from our model output data to derive the relation between entropy versus accuracy.
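One way to picture the channel trick above: render each time step as its own grayscale frame and stack the frames along the channel axis, so the first convolution of a 2-D CNN sees every time step at once and can learn rotation direction with no change to its spatial kernel size. A sketch of generating one clockwise or counterclockwise training clip this way (the sizes and naming are illustrative assumptions, not the project's generator):

```python
import math
import numpy as np

def rotation_clip(clockwise, n_frames=8, size=32, radius=10):
    """Render a single dot orbiting the frame centre, one grayscale
    frame per time step, then stack the frames along axis 0. The
    result has shape (n_frames, size, size): a multi-channel 'image'
    whose channels are consecutive time steps."""
    frames = []
    cx = cy = size // 2
    for t in range(n_frames):
        angle = (2 * math.pi * t / n_frames) * (-1 if clockwise else 1)
        x = int(cx + radius * math.cos(angle))
        y = int(cy + radius * math.sin(angle))
        frame = np.zeros((size, size), dtype=np.float32)
        frame[y, x] = 1.0
        frames.append(frame)
    return np.stack(frames, axis=0)
```

Because both classes start from the identical first frame, any single channel is uninformative; only a filter spanning several channels can separate clockwise from counterclockwise, which is exactly the motion feature the widened input layer is meant to capture.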

https://docs.google.com/presentation/d/13kuM64opDaoH3wVpFatiwiS7849zpjlLzW0q8wsVIVM/edit#slide=id.g259118e31d9_0_229

Week 7 Progress

  • Testing complicated patterns: We tested training data with more complicated patterns of movement. For instance, we trained the model with one class of frames moving in a completely random pattern and the other moving with a 3-degree bias to the right. After running 10,000 samples, we obtained 50% accuracy. We hypothesize three potential causes of the low accuracy: 1) the sample size was too small, 2) there was a problem in our datasets, or 3) there was an error in executing the software stack.
  • Testing with new bias: To rectify the issue, we changed the bias from 3 degrees to 30 degrees. The model was able to identify the change in bias with a 93% accuracy on 10,000 samples.
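The biased class above can be reproduced with a simple biased random walk: at each step the heading changes by Gaussian turn noise plus a fixed angular bias, so a 3-degree bias is nearly buried in the noise while a 30-degree bias dominates it. A sketch (the parameter names and noise model are our assumptions):

```python
import math
import random

def biased_walk(n_steps=1000, bias_deg=30.0, turn_std_deg=45.0, seed=0):
    """Unit-step random walk whose heading drifts by bias_deg degrees
    per step on top of zero-mean Gaussian turn noise. bias_deg=0 gives
    the unbiased class; the larger the bias relative to turn_std_deg,
    the easier the two classes are to tell apart."""
    rng = random.Random(seed)
    heading = 0.0
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        heading += math.radians(bias_deg) + rng.gauss(0, math.radians(turn_std_deg))
        x += math.cos(heading)
        y += math.sin(heading)
        path.append((x, y))
    return path
```

Under these assumptions the per-step signal-to-noise ratio is bias_deg / turn_std_deg, which is consistent with the observed jump from 50% accuracy at a 3-degree bias to 93% at 30 degrees.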

https://docs.google.com/presentation/d/14P4f2n2sCoIkeknVfsmpHFdldTG9xznclu4V_xbCtFc/edit#slide=id.g25a7ea5071e_0_229

Week 8/9 Progress

  • Slurm: Each training run of a dataset takes about 2-3 hours and generates one point (one accuracy and one bias) on the accuracy v. bias graph. Since many points are necessary to identify a consistent pattern between accuracy and bias, the graph could take days to create. We decided to use the Slurm workload manager to submit multiple jobs to iLab that run in parallel, saving substantial time.
  • Automation of model training: We wrote a script that generates several datasets, trains the model on each of them (submitted via Slurm), and outputs a final graph plotting accuracy v. bias (one point per dataset). Since the generation and training process takes hours, this script saves the time required to manually run the model on each dataset.
  • Testing the simulator: We tested the model with a low bias (8 degrees) and obtained an abnormally low accuracy of 50%. To diagnose the issue, we went back to the code and tested the simulator itself to verify that it rotates as intended. We wrote a program that counts the number of left versus right turns in each dataset and displays the counts as a bar graph. In the "no bias" dataset the left and right bars are even, since the number of left turns equals the number of right turns. In the "bias" dataset, the bar for right turns is significantly higher than that for left turns, revealing the bias that causes the bee to turn to the right. This confirms that our simulator correctly produces one biased and one unbiased dataset.
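The left-versus-right turn counter described above can be sketched as follows (we assume the usual math convention that a positive heading change is a left/counterclockwise turn; the project's code may use the opposite sign):

```python
def count_turns(headings):
    """Count left (positive heading change) and right (negative
    heading change) turns in a sequence of headings in degrees,
    handling wraparound at 0/360. An unbiased simulator should give
    roughly equal counts; a right-biased one far more rights."""
    left = right = 0
    for prev, cur in zip(headings, headings[1:]):
        delta = (cur - prev + 180) % 360 - 180  # wrap to [-180, 180)
        if delta > 0:
            left += 1
        elif delta < 0:
            right += 1
    return left, right
```

Plotting the two counts as bars (e.g. with matplotlib's `bar`) reproduces the validation graph: even bars for the unbiased dataset, a dominant right bar for the biased one.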

https://docs.google.com/presentation/d/14_O6oJ1Z98k6zDMZ46I_GgvBa61mNDzJoDTEpYM8tX4/edit#slide=id.g25dd690f981_0_229
