
jinchangge's Blog

Fun College English

A Roundup of Free English Corpora

2010-06-28 18:06:45 | Category: Corpora


The list is constantly updated.

Strictly speaking, some of the entries are not corpora but archives, databases, or even dictionaries. Because some entries are collections of corpora, overlaps in my classification are inevitable, and some resources naturally belong to more than one category.


For non-experts, I highly recommend the SKELL corpus: http://skell.sketchengine.co.uk/run.cgi/skell

For corpus linguists, I strongly recommend the NOW Corpus: http://corpus.byu.edu/now/

1. Large/Generic Corpora

NOW Corpus (News on the Web): http://corpus.byu.edu/now/

Corpus of Global Web-Based English (GloWbE): http://corpus.byu.edu/glowbe/

Corpus of Contemporary American English (COCA): http://corpus.byu.edu/coca/

    http://www.wordandphrase.info/frequencyList.asp

Corpus of Historical American English (COHA): http://corpus.byu.edu/coha/

Download N-Grams from COCA and COHA: http://www.ngrams.info/

TIME Corpora: http://corpus.byu.edu/time/

    http://111.200.194.212/cqp/time/  name: test  password: test

Independent Corpus: http://111.200.194.212/cqp/independent/  name: test  password: test 

Bank of English (BoE): http://www.collins.co.uk/page/Wordbanks+Online (1-month free trial)

Oxford English Corpus (OEC), with free access granted to only a few researchers:

http://www.oxforddictionaries.com/us/words/the-oxford-english-corpus

International Corpus of English (ICE): http://ice-corpora.net/ICE/INDEX.HTM


British National Corpus (BNC)

BYU-BNC: http://corpus.byu.edu/bnc/

SlopeQ BNC: http://pelcra.clarin-pl.eu/SlopeqBNC/

BNCweb: http://bncweb.lancs.ac.uk/bncwebSignup/user/login.php

BNC Sampler: http://cqpweb.lancs.ac.uk/bncsampler/  contact: a.hardie@lancaster.ac.uk

    http://www.lextutor.ca/concordancers/concord_e.html

JustTheWord: http://www.just-the-word.com/

StringNet: http://nav4.stringnet.org/index.php

Wordneighbors: http://wordneighbors.ust.hk/

Phrases in English (PIE):  http://phrasesinenglish.org/

Corpuseye: http://corp.hum.sdu.dk/cqp.en.html

BNC Simple Search: http://www.natcorp.ox.ac.uk/  


Corpus of Online Registers of English (CORE): http://corpus.byu.edu/core/

Strathy Corpus of Canadian English: http://corpus2.byu.edu/can/

Australian National Corpus (AusNC): http://www.ausnc.org.au/corpora/ausnc

Scottish Corpus of Texts and Speech (SCOTS): http://www.scottishcorpus.ac.uk/

Corpus of Modern Scottish Writing (CMSW): http://www.scottishcorpus.ac.uk/cmsw/

New Corpus for Ireland (NCI): https://focloir.sketchengine.co.uk/run.cgi/register_form?uilang=en

Open American National Corpus: http://www.anc.org/data/oanc/ngram/

    http://www.anc.org/data/oanc/download/


Corpora of the Brown Family: http://www.sketchengine.co.uk/

Brown/LOB Corpus (full version):

http://ec-concord.ied.edu.hk/paraconc/monoconcE.htm

Brown Corpus: http://111.200.194.212/cqp/brown1/ ID: test password: test

Brown Corpus (full version):

http://the.sketchengine.co.uk/open/corpus/brown/ske/first_form

Brown Corpus (full version): http://www.icorpus.net/application/conc/

Brown Corpus (full version): https://corpling.uis.georgetown.edu/cqp/brown/

Brown Corpus: All Brown 15 sublists

http://www.lextutor.ca/range/range_corpus/

Online Corpus Concordancer: 

http://211.86.103.26/linwei_web/application/conc/index.php

CLOB Corpus: http://111.200.194.212/cqp/clob/ name: test  password: test 

Crown Corpus: http://111.200.194.212/cqp/crown/   name: test  password: test 

SKELL: http://forbetterenglish.com/ http://skell.sketchengine.co.uk/run.cgi/skell

American English 2006 (AmE06): http://cqpweb.lancs.ac.uk/ame06/    contact: a.hardie@lancaster.ac.uk

British English 2006 (BE06): http://cqpweb.lancs.ac.uk/be2006/    contact: a.hardie@lancaster.ac.uk 


Corpus Aggregators:

Sketch Engine:  http://www.sketchengine.co.uk/

Lextutor: http://www.lextutor.ca/conc/eng/

Skylight: http://www.skylight-to-english.co.uk/skylight/

CLARIN: http://weblicht.sfs.uni-tuebingen.de/Aggregator/#

Leeds: http://corpus.leeds.ac.uk/protected/query.html

 WebCorp: http://wse1.webcorp.org.uk/

Corpus Cloud: http://www.corpuscloud.cn/


Monco: http://monitorcorpus.com/

Corpus (dictionary): Vocabulary.com: https://www.vocabulary.com/dictionary/

2. Parallel/Comparable Corpora

Chinese-English and English-Chinese

General

Xiamen University: http://www.luweixmu.com/ec-corpus/query.asp

Peking University: http://ccl.pku.edu.cn:8080/ccl_corpus/index_bi.jsp

Hong Kong Institute of Education: http://ec-concord.ied.edu.hk/paraconc/index.htm

Babel: http://111.200.194.212/cqp/babel1/ (en-ch)  ID: test  password: test

    http://111.200.194.212/cqp/babel1c/ (ch-en)  ID: test  password: test

Novels 

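Dream of the Red Chamber (红楼梦, Hong Lou Meng): http://corpus.usx.edu.cn/hongloumeng/index.asp

Dream of the Red Chamber (红楼梦, Hong Lou Meng): http://corpus.nie.edu.sg/hlm/index.htm#

Romance of the Three Kingdoms (三国演义): http://corpus.usx.edu.cn/sanguo/index.asp

Journey to the West (西游记): http://corpus.usx.edu.cn/xiyouji/index.asp

Lu Xun's Fiction (鲁迅小说): http://corpus.usx.edu.cn/luxun/index.asp

Water Margin (水浒传): http://corpus.usx.edu.cn/shuihu/index.asp

Romance of the Western Chamber (西厢记): http://corpus.usx.edu.cn/xixiangji/index.asp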

Great men’s works 

Selected Works of Deng Xiaoping (邓小平文选): http://corpus.usx.edu.cn/dengxiaoping/index.asp

Selected Works of Mao Zedong (毛泽东选集): http://corpus.usx.edu.cn/maozedong/index.asp

Classics 

The Great Learning (大学): http://corpus.usx.edu.cn/daxue/index.asp

Laozi (老子): http://corpus.usx.edu.cn/laozi/index.asp

I Ching (易经): http://corpus.usx.edu.cn/yijing/index.asp

Law 

Chinese Laws and Regulations (mainland China):

http://corpus.usx.edu.cn/lawcorpus1/index.asp

Chinese Laws and Regulations (Taiwan):

http://corpus.usx.edu.cn/lawcorpus2/index.asp

Chinese Laws and Regulations (Hong Kong):

http://corpus.usx.edu.cn/lawcorpus3/index.asp

Hong Kong Hansard:  

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Peking University Law: http://en.pkulaw.cn/

English-Chinese Parallel Corpus of Editorials: http://icorpus.net/application/ft/

TED Speeches: Chinese-English http://111.200.194.212/cqp/tedctoe/ ID: test password: test

    English-Chinese http://111.200.194.212/cqp/tedetoc2/ ID: test password: test


3. Business and Financial Corpora

Corpus of Business Correspondence:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

PolyU Business Corpus: 

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Hong Kong Financial Services Corpus: 

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Learner Corpus of English for Business Communication:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

SCMP Corpus of Business Reports: http://langbank.engl.polyu.edu.hk/corpus-search.asp

Business Letter Corpus:

http://www.someya-net.com/concordancer/

Business English Corpus: http://111.200.194.212/cqp/business/  user: test password: test

Business English Corpus: http://biz.yulk.org/

Wall Street Journal Corpus: http://bcc.blcu.edu.cn/lang/en

4. Literary/Historical Corpora

COHA: http://corpus.byu.edu/coha/

Hong Lou Meng Corpus: http://124.193.83.252/cqp/hlmyangs/  name: test   password: test

Online Corpus of Old English Poetry:

http://www.oepoetry.ca/

CAPA (Contemporary American Poetry Archive):

http://capa.conncoll.edu/

SETIS Australian Literary and Historical Texts: 

http://setis.library.usyd.edu.au/oztexts/search.html

Corpus of Middle English Prose and Verse: http://quod.lib.umich.edu/c/cme/

Web Concordances: http://www.concordancesoftware.co.uk/webconcordances/

Chaucer:

http://www.umm.maine.edu/faculty/necastro/chaucer/concordance/

Sherlockian: http://www.sherlockian.net/

Dickens: http://111.200.194.212/cqp/dickens/ name: test  password: test

    https://corpling.uis.georgetown.edu/cqp/dickens/

The Complete Corpus of Anglo-Saxon Poetry: 

http://www.sacred-texts.com/neu/ascp/

Bartleby: http://www.bartleby.com/

Internet Classics Archive: http://classics.mit.edu/index.html

Concordance of Shakespeare's Complete Works: http://www.opensourceshakespeare.org/concordance/

Shakespeare's Words: http://www.shakespeareswords.com/

Shakespeare's Sonnets Corpus: 

http://www.luweixmu.com/ecorpus/sonnets/framconc.asp

Alex Catalogue of Electronic Texts: http://infomotions.com/alex/

Corpus of Electronic Texts (CELT):

http://www.ucc.ie/celt/search.html

OED: www.oed.com  user name: Coastline  password: Oed789

Middle English Dictionary: http://quod.lib.umich.edu/m/med/

MEMEM (Michigan Early Modern English Materials):

http://www.hti.umich.edu/m/memem/

Corpus of Late Modern English Texts: https://perswww.kuleuven.be/~u0044428/

Lu Wei (卢伟): http://www.luweixmu.com/ecorpus/index.htm
The Arabian Nights:  http://cqpweb.lancs.ac.uk/aldine1001/  contact: a.hardie@lancaster.ac.uk
                                    http://cqpweb.lancs.ac.uk/burton1001/  contact: a.hardie@lancaster.ac.uk
CLiC: http://clic.bham.ac.uk/concordances/

5. Web Corpora

WebCorp: http://wse1.webcorp.org.uk/

                     http://www.webcorp.org.uk/live/

Monco: http://monitorcorpus.com/ 

Google Books Corpus: http://googlebooks.byu.edu/

Google Books Ngram Viewer: http://ngrams.googlelabs.com/datasets

Leeds Corpora: http://corpus.leeds.ac.uk/internet.html


6. Learner Corpora 

Chinese learners

Ten-thousand English Compositions of Chinese Learners (TECCL):

http://111.200.194.212/cqp/teccl/   name: test  password: test

CLEC: http://www.icorpus.net/application/conc/index.php

PolyU Learner English Corpus (PLEC):

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Corpus for Higher Education:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Learner Journals: http://langbank.engl.polyu.edu.hk/corpus-search.asp

Learner Corpus of Essays and Reports:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Learner Corpus of English for Business Communication:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Polish learners

PICLE Corpus: http://ifa.amu.edu.pl/~kprzemek/concord2advr/search_adv_new.html

Japanese learners

JEFLL: http://scn.jkn21.com/~jefll03/ 

Singaporean learners

An Investigation in Peer Work and Peer Talk in Singapore Primary Classrooms (PWPT):

http://corpus.nie.edu.sg/pwpt/index.htm#

French learners

Scientext English Learner Corpus: 

http://scientext.msh-alpes.fr/scientext-site-en/spip.php?article19

Hungarian learners

JPU Learner Corpus: http://www.lextutor.ca/concordancers/concord_e.html

Korean learners

Gachon Korean EFL Learner Corpus: https://corpling.uis.georgetown.edu/cqp/gachon/

Mixed

Michigan Corpus of Upper-level Student Papers: 

 http://search-micusp.elicorpora.info/simple/  

Michigan Corpus of Academic Spoken English (MICASE):

http://quod.lib.umich.edu/cgi/c/corpus/corpus?c=micase;page=simple

Multimodal corpus of European teen language:

http://sacodeyl.inf.um.es/sacodeyl-search2/

VOA Special English Corpus: 

http://www.manythings.org/voa/sentences.htm  


7. Speech Corpora

Hansard (British Parliament) Corpus: http://www.hansard-corpus.org/x.asp

Corpus of American Soap Operas: http://corpus2.byu.edu/soap/

The Speech Accent Archive: http://accent.gmu.edu/

IViE Corpus: http://www.phon.ox.ac.uk/files/apps/IViE//search.php

Hong Kong Budget Speeches Corpus:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Hong Kong Policy Address Speeches Corpus:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

CORPS (A CORpus of tagged Political Speeches): http://hlt.fbk.eu/corps

State of the Union Address Corpus: http://www.someya-net.com/concordancer/

                                       https://corpling.uis.georgetown.edu/cqp/stateoftheunion/

Bush and Kerry Presidential Debate: https://corpling.uis.georgetown.edu/cqp/bush_kerry_debate/

Inaugural Address Corpus: https://corpling.uis.georgetown.edu/cqp/inaugural/

Corpus of Ted Speeches:  http://111.200.194.212/cqp/ted/ name: test password: test

                                           http://opus.lingfil.uu.se/bin/opuscqp.pl?corpus=TedTalks;lang=en

                                           http://yohasebe.com/tcse/

                                           http://www.apps4efl.com/tools/talk_corpus/

Parliament Debate Search: http://search.politicalmashup.nl/

American Rhetoric: http://www.americanrhetoric.com/

Hong Kong Baptist University (HKBU) Corpus of Political Speeches:

http://digital.lib.hkbu.edu.hk/corpus/search.php

8. Academic Corpora

British Academic Written English Corpus (BAWE): https://the.sketchengine.co.uk/open/

Corpus of Research Articles: http://langbank.engl.polyu.edu.hk/corpus-search.asp

Anthology Reference Corpus: https://the.sketchengine.co.uk/open/

Corpus of Dissertations in Applied Linguistics:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

MicroConcord Corpus Collections (Academic):

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Springer Exemplar, a corpus of scientific literature: http://springerexemplar.com/index.aspx

ERIC Institute of Education Sciences: http://eric.ed.gov/
Corpus of Scientific Texts:  http://scientext.msh-alpes.fr/scientext-site-en/spip.php?article1

                              name: test  password: test
Huazhong Agricultural University Corpora: http://211.69.132.28/  name: test  password: test

9. Specialized Corpora

Boston University Noun Phrase Corpus: http://npcorpus.bu.edu/

Hong Kong Engineering Corpus: http://langbank.engl.polyu.edu.hk/corpus-search.asp

Hong Kong Corpus of Surveying and Construction Engineering:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

PolyU Corpus of Travel and Tourism Texts (TnT):

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Corpus of Nursing and Health Science (NHS) Texts:

http://langbank.engl.polyu.edu.hk/corpus-search.asp

MicroConcord Corpus Collections (Journalistic):

http://langbank.engl.polyu.edu.hk/corpus-search.asp

Titanic: http://www.luweixmu.com/ecorpus/

Corpus of English Language Teaching (CELT):

http://124.193.83.252/cqp/durban/  name: test  password: test

China Daily Political News 2011 Corpus: http://111.200.194.212/cqp/cd2011/ name: test  password: test

British Law Report Corpus (BLaRC): http://www.lextutor.ca/concordancers/concord_e.html

Argument Transcripts of the US Supreme Court: http://www.supremecourt.gov/oral_arguments/argument_transcripts.aspx
Fashion Communication Corpus (FCC): http://lamalcorpora.engl.polyu.edu.hk/cqpweb/fcc/
Login ID: cqp  Password: cqpweb
Emergency Department Communication Corpus: http://lamalcorpora.engl.polyu.edu.hk/cqpweb/ced/
Login ID: cqp  Password: cqpweb
National University of Singapore SMS Corpus: https://corpling.uis.georgetown.edu/cqp/nus_sms/
Georgetown University Multilayer Corpus: https://corpling.uis.georgetown.edu/cqp/gum/

10. Multi-Modal Corpora

Multimodal corpus of European teen language:

http://sacodeyl.inf.um.es/sacodeyl-search2/

Singapore Corpus of Research in Education: http://corpus.nie.edu.sg/score/index.htm

Scottish Corpus of Texts and Speech: http://www.scottishcorpus.ac.uk/

Santa Barbara Corpus of Spoken American English:

http://www.linguistics.ucsb.edu/research/santa-barbara-corpus

Multimedia News Corpus: http://www.icorpus.net/application/mc/

Australian National Corpus (AusNC): https://www.ausnc.org.au/

11. Spoken Corpora

Michigan Corpus of Academic Spoken English:

http://quod.lib.umich.edu/cgi/c/corpus/corpus?c=micase;page=simple

Asian Corpus of English: http://corpus.ied.edu.hk/ace/

Saarbruecken Corpus of Spoken English:

http://www.uni-saarland.de/lehrstuhl/engling/scose.html

Vienna-Oxford International Corpus of English (VOICE):

http://www.univie.ac.at/voice/ 

NIE Corpus of Spoken Singapore English:

http://videoweb.nie.edu.sg/phonetic/niecsse/index.htm

Hong Kong Corpus of Spoken English: http://langbank.engl.polyu.edu.hk/corpus-search.asp

Santa Barbara Corpus of Spoken American English:

 http://www.linguistics.ucsb.edu/research/santa-barbara-corpus

Friends: http://124.193.83.252/cqp/friends/  name: test password: test

British Academic Spoken English Corpus: https://the.sketchengine.co.uk/open/

Corpus of LDS General Conference Talks: http://www.lds-general-conference.org/x.asp


12. Parsed and Annotated Corpora 
York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE):

13. Computer-Mediated Communication(CMC) Corpora

14. Multilingual Corpora

Corpora from the web (COW): https://webcorpora.org/

15. Bible and Quran Corpora
Quran Analysis: http://qurananalysis.com/

16. Collocation
Collocations Search: http://collocations.ooz.ie/