The holidays may be over, but the presents are still arriving from the UK Smalltalk User Group! Videos of previous presentations have been released over the past month, covering a variety of interesting topics, with a total of 57 as of this writing! Be sure to check out their new YouTube channel at https://www.youtube.com/@UKSTUG and their homepage at https://www.uksmalltalk.org/. If you would like to attend any meetings, their Meetup site can be found at https://www.meetup.com/ukstug/.
Have a great time with Smalltalk and keep on Squeaking!
In a recent email to the Squeak developers mailing list (here), Tim Rowledge shared insights from the Raspberry Pi team regarding beneficial tweaks to memory configuration in the firmware, specifically focusing on NUMA (Non-Uniform Memory Access). These updates are part of ongoing efforts to enhance SDRAM performance for both Raspberry Pi 4 and 5 models.
Performance Enhancements Explained
Recent testing revealed that the 8GB models sometimes performed worse than the 4GB models because SDRAM refresh consumes bandwidth, and larger memory sizes require longer refresh times. Investigations showed that adjusting the SDRAM refresh interval could yield better results. Monitoring the SDRAM temperature indicated that refreshing less often was feasible, which reduces the refresh overhead. Micron also confirmed that 8GB SDRAM can safely operate with the 4GB refresh timings.
Ongoing tweaks to SDRAM and ARM settings have led to small but cumulative performance improvements, typically around 1% per update. A significant issue noted is the competition among multiple ARM cores accessing SDRAM, leading to inefficiencies when multiple pages in the same bank are accessed. Implementing NUMA can help manage this by splitting SDRAM into regions, allowing for more efficient allocation and improved performance in multi-core tasks.
Benchmarking Squeak Smalltalk
Tim tested these configurations on a Raspberry Pi 5 equipped with NVMe storage, running benchmarks from the Benchmark Shootout suite. The tests included:
nbody
binary trees
chameneos redux
thread ring
Results of the NUMA Configuration
The performance comparisons between a NUMA-configured Raspberry Pi 5 and a standard setup yielded notable results:
nbody: Improved from 5.157 seconds to 5.095 seconds (1.2% improvement)
binary trees: Improved from 3.398 seconds to 3.096 seconds (8.9% improvement)
chameneos redux: Improved from 7.274 seconds to 5.239 seconds (28% improvement)
thread ring: Improved from 8.347 seconds to 7.783 seconds (6.8% improvement)
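The percentages above can be recomputed from the raw timings. The following is a minimal sketch in Python rather than anything from Tim's email: it assumes the first figure in each pair is the standard setup and the second is the NUMA configuration, and the names `timings` and `percent_improvement` are made up for this illustration.

```python
# Timings in seconds, copied from the results listed above.
# Assumption: each pair is (standard setup, NUMA configuration).
timings = {
    "nbody": (5.157, 5.095),
    "binary trees": (3.398, 3.096),
    "chameneos redux": (7.274, 5.239),
    "thread ring": (8.347, 7.783),
}


def percent_improvement(before, after):
    """Reduction in run time relative to the baseline, as a percentage."""
    return (before - after) / before * 100.0


# Hypothetical helper mapping: benchmark name -> percentage reduction in run time.
improvements = {
    name: percent_improvement(before, after)
    for name, (before, after) in timings.items()
}

for name, pct in improvements.items():
    before, after = timings[name]
    print(f"{name}: {before:.3f} s -> {after:.3f} s ({pct:.1f}% less time)")
```

Measuring the improvement as a reduction in run time relative to the baseline is one common convention; expressing it as a speedup ratio (before divided by after) would give slightly different figures for the same data.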
These findings indicate that even minor adjustments to the memory configuration can lead to substantial performance gains, most visibly in the chameneos redux benchmark.
Conclusion
Tim Rowledge’s testing of the Raspberry Pi team’s memory configuration tweaks has revealed valuable performance gains. By implementing NUMA and optimizing SDRAM settings, Raspberry Pi users can unlock significant benefits. These small adjustments can lead to meaningful improvements, making the Raspberry Pi an even more effective tool for development and education. You can find the Raspberry Pi team’s recommended tweaks here. Additionally, you can explore the bug report thread detailing the testing and findings here.
In this episode we talk to tim Rowledge about his work on Smalltalk VMs over the years, especially for the RISC OS platform and ARM machines. The latest and probably hottest thing in this arena is his port of Squeak to the Raspberry Pi. This is not only cool in itself, but more importantly enables Raspberry Pi users to use Scratch and EToys on this little machine on RISC OS (the Raspbian Linux version existed before). You can probably imagine how much fun we had in recording this session.