University of Brighton

A multi-coloured mix of molecules, columns of numbers and radio waves, depicting modern communications and research into internet security.

Centre for Secure, Intelligent and Usable Systems

Sketch Engine

To produce a dictionary, you need a large collection of language that shows how a word is used, how often it appears and where. Dictionary-makers draw on a large repository of sentences and literature known as a corpus, which contains millions of words. For example, the British National Corpus offers 100 million words drawn from literature and spoken conversation, giving a valuable picture of how contemporary language is used.

Starting in the 1990s with research into enhancing online lexical resources, Dr Roger Evans and Dr Adam Kilgarriff developed a new approach to lexicography, using computer-based statistical analysis of the behaviour of individual words in large bodies of online text.

Revolutionising language use - article

Project aims

When the research began, the researchers were thinking about how to build resources for computerised language processing systems. This required a great deal of information about how words behave, so they started looking at dictionaries. What they found was more interesting and challenging, so the project evolved into supporting the dictionary-making process rather than drawing on existing dictionaries. There was a definite progression from building computer systems to creating tools for dictionary production.

The key innovation of the project was a new method for creating word sense profiles, or word sketches, capturing the detailed behaviour of individual words from large collections of text. Using these word sketches, they created a computational lexicography tool, which was commercialised as the ‘Sketch Engine’ by Lexical Computing Ltd. The company was set up in 2003 by Kilgarriff and Pavel Rychlý, a researcher in text processing tools at Masaryk University in Brno, Czech Republic, at a time when dictionary publishers were beginning to look at moving online.
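To give a rough feel for what a statistical profile of a word's behaviour can look like, here is a minimal sketch in Python. It is a toy illustration only and not the Sketch Engine's method: it counts which words appear near a target word and ranks them with the logDice association score. The function name `toy_word_sketch`, the window-based notion of "nearby" and the choice of logDice are all assumptions made for this example; the article does not say which statistics or grammatical analysis the real tool uses.

```python
import math
from collections import Counter


def toy_word_sketch(sentences, target, window=2, top_n=5):
    """Rank words that occur within `window` tokens of `target`.

    Hypothetical helper for illustration. Tokenisation is a plain
    whitespace split, so punctuation is not handled.
    """
    target = target.lower()
    word_freq = Counter()  # how often each token occurs in the corpus
    pair_freq = Counter()  # how often each token occurs near the target

    for sentence in sentences:
        tokens = sentence.lower().split()
        word_freq.update(tokens)
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    pair_freq[tokens[j]] += 1

    # logDice = 14 + log2(2 * f_xy / (f_x + f_y)); assumed scoring choice.
    scores = {
        word: 14 + math.log2(2 * f_xy / (word_freq[target] + word_freq[word]))
        for word, f_xy in pair_freq.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


corpus = ["strong tea", "strong tea", "hot tea", "strong coffee"]
for word, score in toy_word_sketch(corpus, "tea", window=1):
    print(f"{word}\t{score:.2f}")
```

On this tiny made-up corpus, "strong" ranks above "hot" as a companion of "tea" because the pair occurs more often relative to the frequencies of the two words. A real corpus would contain millions of words, and a real profile would group companions by grammatical relation rather than simple proximity.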

Ultimately, the Sketch Engine allows us to create from the Collins Corpus a true picture of language as it is currently used and gives us empirical evidence on which to base our content. This allows us to claim with confidence that our language reference products are based on language as it is really used and so are the most authoritative available.

David Wark, Senior Publishing Systems and Data Developer, HarperCollins

Project findings and impact

The Sketch Engine has been adopted by four of the UK’s five major dictionary publishers. Lexical Computing Ltd is working with Oxford University Press to analyse children’s language and with Cambridge University Press to analyse the language produced by learners of English. National language institutes in nine European countries and 200 universities worldwide use it to support language research, dictionary production, language technology products and language teaching. It has given users access to information on between 30 million and 70 billion words in 61 different languages. Lexical Computing Ltd now employs staff in the UK and the Czech Republic, along with freelancers in a number of other countries. Half of the company’s business is overseas and it runs training courses around the world.

The Sketch Engine has also been used to substantiate arguments in a pervasive debate about language use in the art world. A 2010 analysis of exhibition announcements, which used the Sketch Engine’s search tool, was published in the US art journal Triple Canopy and sparked an international debate on the language of art. The article has since become a widely circulated piece of online cultural criticism, prompting further debate on other platforms, including WordPress, Tumblr, Google+, Ikono, Artblog and Artsia.

Read the Sketch Engine article in UK Computer Science Impact Report

Research team

Dr Roger Evans

Dr Adam Kilgarriff

Dr David Tugwell

Output

Sketch Engine article in The Computer Science Impact Report.

The UK Computer Science Impact Report

Partners

Macmillan Publishers Ltd

Lexical Computing Ltd
