Show HN: Proxy Block-CAGE, a new sparse block attention method
Hi, I'm a PhD student in Bioinformatics/Computational Biology with a software engineering background, and I'm trying to pivot toward AI/ML research. I'm familiar with the practical side of AI, such as using scikit-learn, R, PyTorch, and ONNX Runtime. I wondered whether LLMs could serve as research assistants for creating better AI/ML algorithms, so I asked ChatGPT to help find better ways to solve one of the most computationally intensive problems in Transformer-based models: attention. I instructed ChatGPT to use genetic algorithms, genetic programming, and other optimization techniques (which I use extensively in my bioinformatics research) to find better attention methods for Transformers, and this was the result. I would love to get feedback and comments from the AI/ML research community. Please read the md file and get back to me. I hope you find this useful. Thank you. Read more here
