Changes between Version 28 and Version 29 of Other/Summer/2024/lLM

-              v28
+              v29
 . **Research Paper**: We read and annotated a Google DeepMind paper on Weight Averaged Reward Models (WARM), an approach to training reward models that mitigates reward hacking. The paper compares WARM with more traditional ensembling: an ensemble averages the outputs of several individual models, whereas WARM averages the weights of multiple fine-tuned reward models into a single model that produces one output. We aim to present this paper to Dr. Ortiz and the team at our weekly meeting next Tuesday.
+.
+. **Sensor Testing**: We also spent a considerable portion of our time testing the Maestro sensor suite and verifying that every sensor can transmit meaningful data to the Testbed server. We tested the following sensors: Accelerometer, Humidity, Temperature, Air Quality, Infrared Motion, RGB, and Audio. All of the sensors worked as expected except audio, which kept returning high-entropy, noise-like values. We tested the audio sensors by introducing extremely loud stimuli for short periods and keeping the environment nearly silent between stimuli. However, the streamed values showed no significant shift between the loud and silent periods. This is currently a work in progress.
+. **Loguru Logging**:
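
To make the Research Paper item above concrete, here is a minimal sketch of the difference between output ensembling and weight averaging. It is not the paper's implementation: the tiny tanh network, the `perturb` helper (a stand-in for separate fine-tuning runs from a shared initialization), and all names and sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def init_params(rng, d_in=4, d_hidden=8):
    # Hypothetical toy reward model: one tanh hidden layer, scalar output.
    return {
        "W1": rng.normal(size=(d_hidden, d_in)),
        "b1": np.zeros(d_hidden),
        "w2": rng.normal(size=d_hidden),
        "b2": np.zeros(1),
    }

def perturb(params, rng, scale=0.05):
    # Assumption: small random offsets stand in for independent fine-tuning
    # runs that all start from the same shared initialization.
    return {k: v + scale * rng.normal(size=v.shape) for k, v in params.items()}

def reward(params, x):
    # Forward pass of a single reward model.
    h = np.tanh(params["W1"] @ x + params["b1"])
    return float((params["w2"] @ h + params["b2"])[0])

def ensemble_reward(models, x):
    # Ensembling: run every model, then average the outputs.
    return float(np.mean([reward(p, x) for p in models]))

def average_weights(models):
    # Weight averaging: average each parameter across models once,
    # yielding a single model that needs one forward pass per input.
    return {k: np.mean([p[k] for p in models], axis=0) for k in models[0]}

rng = np.random.default_rng(0)
base = init_params(rng)
models = [perturb(base, rng) for _ in range(3)]
merged = average_weights(models)

x = rng.normal(size=4)
print("ensemble output:       ", ensemble_reward(models, x))
print("weight-averaged output:", reward(merged, x))
```

The ensemble needs one forward pass per member at inference time, while the merged model needs only one. Because the network is nonlinear, the two outputs generally differ; they stay close here only because the toy models are small perturbations of the same starting point.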
 == Week 5