https://arxiv.org/abs/2007.14062
Big Bird: Transformers for Longer Sequences
Transformers-based models, such as BERT, have been one of the most successful deep learning models for NLP. Unfortunately, one of their core limitations is the quadratic dependency (mainly in terms of memory) on the sequence length due to their full attention mechanism.
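To get a feel for the "quadratic → linear" claim before the meeting, here is a rough sketch (my own, not from the paper's code) of running BigBird's block-sparse attention through the Hugging Face transformers implementation; the checkpoint name and the block / random-block settings below are just illustrative defaults:

```python
import torch
from transformers import BigBirdModel, BigBirdTokenizer

# Block-sparse attention = sliding window + a few global tokens + random blocks,
# so memory grows roughly linearly with sequence length instead of quadratically.
tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-roberta-base")
model = BigBirdModel.from_pretrained(
    "google/bigbird-roberta-base",
    attention_type="block_sparse",   # "original_full" would bring back O(n^2) attention
    block_size=64,                   # illustrative values (the library defaults)
    num_random_blocks=3,
)

text = " ".join(["long input document"] * 500)   # well past BERT's 512-token limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)

with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)   # (batch, sequence length, hidden size)
```

As far as I remember, if the input is shorter than about (5 + 2 × num_random_blocks) × block_size tokens, the implementation falls back to full attention with a warning, so try it on genuinely long inputs.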
https://www.youtube.com/watch?v=WVPE62Gk3EM&ab_channel=YannicKilcher
Sorry, I couldn't find anything in Korean ㅠㅠ Please read the papers before the meeting.
https://github.com/monologg/KoBigBird
GitHub - monologg/KoBigBird: 🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)
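Loading the Korean checkpoint should look roughly like the repo's README sketched below; the hub id is my assumption of the model name, so double-check it on the repo:

```python
from transformers import AutoModel, AutoTokenizer

# Checkpoint id is assumed to be the repo's Hugging Face hub name.
tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")

# Encode a (long) Korean document with up to 4096 tokens in one pass.
inputs = tokenizer("한국어 문서", return_tensors="pt", truncation=True, max_length=4096)
outputs = model(**inputs)
```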