Comments (4)
So, what idioms are we using?
- 5 idioms from band 1
- 5 idioms from band 2
- 5 idioms from band 3
- 5 idioms from band 4

Collect five contexts for each idiom.
from idiomify.
Wait, we also have a corpus-based study of the most frequently used idioms.
Let's pick the top five from each genre, maybe?
Some of them should include opaque idioms. -> But then again, we can't know in advance which ones will be "opaque"; that judgment is rather subjective.
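A top-five-per-genre selection could be sketched like this (a minimal sketch; the genre labels and frequency counts below are made up for illustration — the real numbers would come from the corpus-based frequency study):

```python
from collections import defaultdict

# Hypothetical (genre, idiom, frequency) rows, standing in for the
# corpus-based frequency counts mentioned above.
rows = [
    ("news", "at the end of the day", 131),
    ("news", "the bottom line", 90),
    ("fiction", "once in a while", 75),
    ("fiction", "let it go", 60),
    ("academic", "by and large", 42),
]

by_genre = defaultdict(list)
for genre, idiom, freq in rows:
    by_genre[genre].append((freq, idiom))

# Top five idioms per genre, most frequent first.
top5 = {
    genre: [idiom for _, idiom in sorted(items, reverse=True)[:5]]
    for genre, items in by_genre.items()
}
```

This sidesteps the opacity question entirely: frequency is objective, opacity is not.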
These are my selections
idioms |
---|
every last |
all of a sudden |
on the air |
the bottom line |
big deal |
at the end of the day |
on the record |
last thing |
behind the scenes |
in the long run |
big deal |
meet someone's eye |
look someone in the eye |
on someone's mind |
once in a while |
let it go |
a hell of a |
out of one's mind |
look someone up and down |
be history |
the likes of |
for free |
go public |
fall short of |
from scratch |
leave one's mark |
set the stage |
pave the way for |
the old days |
do away with something |
in the light of |
as it were |
close to |
if anything |
by and large |
come to mind |
on the horizon |
by the same token |
out of key |
to name but a few |
There are 40 rows in total, though "big deal" appears twice, so 39 distinct idioms. I tried to choose 10 idioms from each corpus.
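A quick sanity check on the table above (a throwaway script — the rows are just pasted into a list) confirms the counts:

```python
from collections import Counter

idioms = [
    "every last", "all of a sudden", "on the air", "the bottom line",
    "big deal", "at the end of the day", "on the record", "last thing",
    "behind the scenes", "in the long run", "big deal", "meet someone's eye",
    "look someone in the eye", "on someone's mind", "once in a while",
    "let it go", "a hell of a", "out of one's mind",
    "look someone up and down", "be history", "the likes of", "for free",
    "go public", "fall short of", "from scratch", "leave one's mark",
    "set the stage", "pave the way for", "the old days",
    "do away with something", "in the light of", "as it were", "close to",
    "if anything", "by and large", "come to mind", "on the horizon",
    "by the same token", "out of key", "to name but a few",
]

counts = Counter(idioms)
duplicates = [idiom for idiom, c in counts.items() if c > 1]
print(len(idioms), len(counts), duplicates)
# -> 40 39 ['big deal']
```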
Now..
Related Issues (19)
- Implementing `m-1-1` - the first baseline
- `d-1-4`: Preprocess PIE dataset to build NER labels for Idiomify task
- `entities:d-1-4`: define the entities
- Idiomifier as an NER tagger
- Implement `Idiomifier` class with OpenAI's GPT-3 API
- fine-tune Davinci-002
- Implementing `m-1-2` - testset, metrics, deploy
- build `literal2idiomatic` dataset (at least three exemplar usages for each idiom)
- add a simple password check
- Why does fine-tuning perform worse on the same data?
- add password check
- remove the special tokens
- `main_infer.py`: don't split sentences
- login with OpenAI token
- Chronicles
- `d-1-3`: PIE dataset - annotate the idioms with special tokens and add their definitions to `idioms` artifacts
- `t-1-1`: Saving a pre-trained `BartTokenizer` with the special tokens (`<idiom>`, `</idiom>`)
- `m-1-3`: The same as `m-1-2`, except that it prints out the special tokens before and after the idioms.
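For reference, the special-token annotation described in `d-1-3` and `m-1-3` above might be sketched like this (a minimal illustration; `wrap_idiom` is a hypothetical helper, not code from the repo):

```python
import re

IDIOM_OPEN, IDIOM_CLOSE = "<idiom>", "</idiom>"

def wrap_idiom(sentence: str, idiom: str) -> str:
    """Wrap the first occurrence of `idiom` in the special tokens."""
    pattern = re.compile(re.escape(idiom), flags=re.IGNORECASE)
    return pattern.sub(f"{IDIOM_OPEN}{idiom}{IDIOM_CLOSE}", sentence, count=1)

print(wrap_idiom("We built it from scratch.", "from scratch"))
# -> We built it <idiom>from scratch</idiom>.
```

With HuggingFace `transformers`, the tokens themselves would be registered via `tokenizer.add_special_tokens({"additional_special_tokens": ["<idiom>", "</idiom>"]})` before saving, which is presumably what `t-1-1` covers.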