Résumé parsing

From Wikipedia, the free encyclopedia

Resume parsing, also known as CV parsing, resume extraction, or CV extraction, allows for the automated storage and analysis of resume data. The resume is imported into parsing software and the information is extracted so that it can be sorted and searched.


Resume parsers analyze a resume, extract the desired information, and insert the information into a database with a unique entry for each candidate.[1] Once the resume has been analyzed, a recruiter can search the database for keywords and phrases and get a list of relevant candidates. Many parsers support semantic search, which adds context to the search terms and tries to understand intent in order to make the results more reliable and comprehensive.[2]
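
The store-then-search flow described above can be sketched in a few lines. This is only an illustration, assuming the parsing step has already produced a dictionary per candidate; the table layout and the function names `build_candidate_db` and `search_by_keyword` are hypothetical, and the search is a plain keyword match rather than the semantic search mentioned above.

```python
import sqlite3

def build_candidate_db(parsed_resumes):
    """Store one row per candidate in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE candidates ("
        " id INTEGER PRIMARY KEY,"  # unique entry for each candidate
        " name TEXT, email TEXT UNIQUE, skills TEXT)"
    )
    conn.executemany(
        "INSERT INTO candidates (name, email, skills) VALUES (?, ?, ?)",
        [(r["name"], r["email"], ", ".join(r["skills"])) for r in parsed_resumes],
    )
    conn.commit()
    return conn

def search_by_keyword(conn, keyword):
    """Return names of candidates whose skills mention the keyword.

    SQLite's LIKE is case-insensitive for ASCII text by default.
    """
    rows = conn.execute(
        "SELECT name FROM candidates WHERE skills LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [name for (name,) in rows]

conn = build_candidate_db([
    {"name": "Ada Example", "email": "ada@example.com", "skills": ["Python", "SQL"]},
    {"name": "Bo Sample", "email": "bo@example.com", "skills": ["Java"]},
])
print(search_by_keyword(conn, "python"))
```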

Machine learning

Machine learning is central to resume parsing. Each block of information needs to be given a label and sorted into the correct category, whether that's education, work history, or contact information.[3] Rule-based parsers use a predefined set of rules to parse the text. This approach works poorly for resumes because the parser needs to "understand the context in which words occur and the relationship between them."[4] For example, if the word "Harvey" appears on a resume, it could be the name of an applicant, refer to Harvey Mudd College, or reference the company Harvey & Company LLC. The abbreviation MD could mean "Medical Doctor" or "Maryland". A rule-based parser would require extremely complex rules to account for all of this ambiguity and would still provide limited coverage.
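
To make the "MD" ambiguity concrete, here is a small rule-based sketch. The two rules and the hint words are assumptions chosen for illustration, not rules taken from any actual parser. Each rule covers only a narrow case, and anything outside them falls through as ambiguous, which is the limited coverage described above.

```python
import re

MD_TOKEN = re.compile(r"\bMD\b")
ZIP_AFTER_MD = re.compile(r"\bMD\s+\d{5}\b")  # e.g. "Baltimore, MD 21201"
DEGREE_HINTS = re.compile(r"\b(medicine|medical|physician|residency)\b", re.IGNORECASE)

def classify_md(line):
    """Guess what 'MD' means on one resume line using hand-written rules."""
    if not MD_TOKEN.search(line):
        return None            # the token does not occur on this line
    if ZIP_AFTER_MD.search(line):
        return "state"         # followed by a five-digit ZIP code
    if DEGREE_HINTS.search(line):
        return "degree"        # a medical keyword appears on the same line
    return "ambiguous"         # no rule applies

for line in ["Baltimore, MD 21201",
             "MD, Johns Hopkins School of Medicine",
             "Jane Doe, MD"]:
    print(line, "->", classify_md(line))
```

The last line is left unresolved even though a human reader would take it as a degree; covering it would need yet another rule, and so on for every new layout.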

This is where machine learning, and specifically natural language processing (NLP), comes in. NLP is a branch of artificial intelligence that uses machine learning to understand content and context and to make predictions.[5] Several NLP techniques are especially important in resume parsing. Acronym normalization and tagging accounts for the different formats in which an acronym can be written and maps them to a single form. Lemmatization reduces words to their root form using a language dictionary, while stemming strips suffixes such as “s” and “ing”. Entity extraction uses regular expressions, dictionaries, statistical analysis and complex pattern-based extraction to identify people, places, companies, phone numbers, email addresses, important phrases and more.[4]
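
A few of these techniques can be sketched with the Python standard library alone. The patterns below are deliberately simplified assumptions (the phone pattern only covers common US-style numbers, and the stemmer knows three suffixes); production systems use far richer rules, dictionaries and statistical models.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

def extract_contacts(text):
    """Entity extraction with regular expressions: emails and phone numbers."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}

def normalize_acronym(token):
    """Map 'M.B.A.', 'm.b.a' and 'MBA' to one form by dropping dots and upper-casing."""
    return token.replace(".", "").upper()

def naive_stem(word):
    """Strip a few common suffixes; real stemmers apply many more rules."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(extract_contacts("Contact: jane.doe@example.com, (555) 123-4567"))
print(normalize_acronym("M.B.A."), naive_stem("skills"), naive_stem("managing"))
```

Note that the stemmer turns "managing" into "manag", which is not a dictionary word. That is the practical difference from lemmatization, which would look the word up and return "manage".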


Resume parsers have achieved up to 87% accuracy,[6] which refers to the accuracy of data entry and categorizing the data correctly. Human accuracy is typically not greater than 96%, so the resume parsers have achieved "near human accuracy."[7]
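
The sources do not say exactly how these accuracy figures were computed. One plausible reading, shown here purely as an assumption, is field-level accuracy: the share of expected fields for which the parser's output matches a hand-checked reference. The function name and record layout are hypothetical.

```python
def field_accuracy(parsed, expected):
    """Fraction of expected fields whose parsed value matches the reference.

    Values are compared after trimming whitespace and ignoring case.
    """
    if not expected:
        raise ValueError("expected must contain at least one field")
    correct = sum(
        1
        for field, value in expected.items()
        if str(parsed.get(field, "")).strip().casefold() == str(value).strip().casefold()
    )
    return correct / len(expected)

reference = {"name": "Jane Doe", "email": "jane@example.com",
             "degree": "BSc Chemistry", "employer": "Example Labs"}
output = {"name": "Jane Doe", "email": "jane@example.com",
          "degree": "BSc Chemistry", "employer": ""}  # employer was missed
print(field_accuracy(output, reference))
```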

One executive recruiting company tested three resume parsers against human data entry to compare accuracy. It ran 1,000 resumes through the parsing software and also had humans manually parse and enter the data. A third party was brought in to evaluate how the humans did compared to the software. The evaluation found that the results from the resume parsers were more comprehensive and had fewer mistakes. The humans did not enter all the information on the resumes and occasionally misspelled words or wrote incorrect numbers.[8]

In a 2012 experiment, a resume for an ideal candidate was created based on the job description for a clinical scientist position. When the resume was run through a parser, one of the candidate's work experiences was lost entirely because the date was listed before the employer. The parser also missed several educational degrees. As a result, the candidate received a relevance ranking of only 43%. Had this been a real candidate's resume, they would not have moved on to the next step even though they were qualified for the position.[9] A similar study of current resume parsers would show whether the technology has improved since then.


  • A notable resume study was conducted by Marianne Bertrand and Sendhil Mullainathan in 2003. They wanted to observe the effect of White-sounding versus Black-sounding names on resumes in the hiring process. For the same job openings, they sent resumes that ranged from low to high quality and were identical in qualifications and credentials, differing only in the name of the applicant. One group had stereotypically Caucasian names such as Greg and Emily, and the other had stereotypically African-American names such as Darnell and Tamika. Bertrand and Mullainathan then recorded how many of the applicants received callbacks for an interview. The results showed that, regardless of resume quality, the resumes of White applicants elicited 50% more callbacks than those of their Black counterparts. In other words, the quality of the resume mattered less than the perceived race of the applicant in the selection process. The attitudes of the hiring managers were not measured, so it is unknown whether this is a form of implicit or explicit bias. However, companies continue to discriminate against Black applicants and have bias built into their hiring processes.[10] Resume parsing can reduce the bias that arises in the hiring process by allowing applicants to be ranked on objective information. The software can be programmed to disregard and conceal the elements of a resume that can lead to bias (e.g. name, gender, race, age, address).[11]
  • The technology is extremely cost-effective and saves resources. Rather than asking candidates to enter their information manually, which could discourage them from applying, or spending recruiters' time on it, data entry is done automatically.[12]
  • The contact information, relevant skills, work history, educational background and more specific information about the candidate are easily accessible.[12]
  • The applicant screening process is now significantly faster and more efficient. Instead of having to look at every resume, recruiters can filter them by specific characteristics, sort and search them. This allows recruiters to move through the interview process and fill positions at a faster rate.
  • One of the biggest complaints people searching for jobs have is the length of the application process. With resume parsers, the process is now faster and candidates have an improved experience.[13]
  • The technology helps prevent qualified candidates from slipping through the cracks. On average, a recruiter spends 6 seconds looking at a resume.[14] When a recruiter is looking through hundreds or thousands of them, it can be easy to miss or lose track of potential candidates.
  • Once a candidate's resume has been analyzed, their information remains in the database. If a position comes up that they are qualified for but have not applied to, the company still has their information and can reach out to them.
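
Two of the points above, concealing bias-prone fields and filtering candidates by a characteristic, can be sketched as follows. The record layout and function names are hypothetical, and the set of hidden fields simply mirrors the examples given in the first bullet; this is not a description of any vendor's product.

```python
BIAS_FIELDS = frozenset({"name", "gender", "race", "age", "address"})

def redact_for_screening(candidate, hidden_fields=BIAS_FIELDS):
    """Return a copy of a parsed record without fields that could introduce bias."""
    return {key: value for key, value in candidate.items() if key not in hidden_fields}

def filter_by_skill(candidates, skill):
    """Keep only the candidates who list the given skill (case-insensitive)."""
    wanted = skill.casefold()
    return [
        c for c in candidates
        if wanted in {s.casefold() for s in c.get("skills", [])}
    ]

pool = [
    {"id": 1, "name": "Greg Example", "address": "1 Main St", "skills": ["Python", "SQL"]},
    {"id": 2, "name": "Tamika Sample", "address": "2 Oak Ave", "skills": ["python"]},
    {"id": 3, "name": "Emily Placeholder", "address": "3 Elm Rd", "skills": ["Java"]},
]
shortlist = [redact_for_screening(c) for c in filter_by_skill(pool, "Python")]
print(shortlist)
```

The recruiter sees only the candidate id and skills in the shortlist; the id can be used to look up contact details once a screening decision has been made.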


The parsing software has to rely on complex rules and statistical algorithms to correctly capture the desired information in the resumes. There are many variations of writing style, word choice, syntax, etc. and the same word can have multiple meanings. The date alone can be written hundreds of different ways.[1] It is still a challenge for these resume parsers to account for all the ambiguity. Natural Language Processing and Artificial Intelligence still have a way to go in understanding context-based information and what humans mean to convey in written language.
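
The date problem is easy to demonstrate. The sketch below tries a short, assumed list of formats with the standard library's `datetime.strptime` and returns a normalized (year, month) pair. It assumes English month names (the default C or an English locale). Any format that is not on the list is simply not recognized, which is the coverage problem described above.

```python
from datetime import datetime

# A handful of the many ways a month and year can be written.
DATE_FORMATS = ("%B %Y", "%b %Y", "%b. %Y", "%m/%Y", "%Y-%m")

def parse_resume_date(text):
    """Return (year, month) for a recognized date string, else None."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
        return (parsed.year, parsed.month)
    return None

for raw in ["January 2015", "Jan 2015", "01/2015", "2015-01", "Sept. 2015"]:
    print(raw, "->", parse_resume_date(raw))
```

The first four strings normalize to the same value, while "Sept. 2015" is not recognized because none of the listed formats accepts the four-letter abbreviation.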

Resume optimization

Resume parsers have become so omnipresent that rather than writing to a recruiter, candidates should focus on writing to the parsing system. Understanding how they work is a great first step, but there are also specific changes an applicant can make to optimize their resume. Here are some tips on how to do that:

  1. Use keywords from the job description in relevant places on your resume. These keywords will almost certainly be included in the parsing process.[12]
  2. Don't use headers or footers. They tend to confuse the parsing algorithms.[15]
  3. Use a simple style for fonts, layouts and formatting.[15]
  4. Avoid graphics.[15]
  5. Use standard section names such as “Work Experience” and “Education”.[3]
  6. Avoid using acronyms unless they're included in the job description. The safest option may be to write the long form and include the acronym after in parentheses.[3]
  7. Don't start with dates in the "Work Experience" section. Parsers typically look for dates following job titles or company names.[3]
  8. Stay consistent with formatting past work experience. The standard is job title, company name, and then employment dates.[9]
  9. Most resume parsers claim to work with all of the main file types, but stick with docx, doc and pdf to be on the safe side.[3]
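
Tips 7 and 8 are easier to appreciate with a toy pattern in hand. The regular expression below is an assumption made for illustration, not how any particular parser works: it expects "job title, company, start - end" on one line, so an entry that leads with the dates does not match and would be dropped.

```python
import re

EXPERIENCE_RE = re.compile(
    r"^(?P<title>[^,]+),\s*(?P<company>[^,]+),\s*"
    r"(?P<start>\d{4})\s*-\s*(?P<end>\d{4}|Present)$"
)

def parse_experience_line(line):
    """Return the fields of a 'title, company, dates' line, or None if it does not fit."""
    match = EXPERIENCE_RE.match(line.strip())
    return match.groupdict() if match else None

print(parse_experience_line("Clinical Scientist, Example Labs, 2008 - 2012"))
print(parse_experience_line("2008 - 2012, Example Labs, Clinical Scientist"))
```

The first call returns the four fields, and the second returns None even though both lines carry the same information.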

Software and vendors

There are many stand-alone resume parsers, including RChilli, Skillate, CandidateZip, Sovren, Daxtra, Textkernel, and Hireability.[16] Parsers are also typically bundled with applicant tracking systems (ATS), which companies use to streamline the hiring process. 90% of Fortune 500 companies use applicant tracking systems, which can handle everything from processing job applications and managing the recruiting process to executing the hiring decision.[17]

Recent advances in AI and machine learning, together with improvements in text mining and analysis that reportedly yield up to 95% accuracy[18] in data processing, have led to many AI tools[19] that help job seekers create application documents. These services focus on creating ATS-friendly resumes, checking and screening resumes, and helping with the rest of the preparation and application process. Some AI builders, such as Leap.ai and Skillroads, concentrate on resume creation, while others, like Stella, also help with the job hunt itself by matching candidates to appropriate vacancies. In 2017, Google attempted to disrupt the US$215.68 billion (as of 2017) global recruitment market by launching Google for Jobs, which is predicted to greatly affect the labor market. This extension of the search engine uses Cloud Talent Solution,[20] Google's own product, which is another iteration of an AI-based resume builder and matching system.


Resume parsers are already standard in most mid- to large-sized companies and this trend will continue as the parsers become even more affordable.[12]

A qualified candidate's resume can be ignored if it is not formatted the proper way or doesn't contain specific keywords or phrases. As Machine Learning and Natural Language Processing get better, so will the accuracy of resume parsers.

Resume parsing vendors are working to expand beyond pure extraction into contextual analysis of the information in a resume. One employee at a parsing company said “a parser needs to classify data, enrich it with knowledge from other sources, normalize data so it can be used for analysis and allow for better searching.”[21]

Parsing companies are also being asked to expand beyond just resumes or even LinkedIn profiles. They are working on extracting information from industry-specific sites such as GitHub and social media profiles.[21]  


  1. ^ a b “What Is CV/Resume Parsing?” DaXtra, Daxtra Technologies Ltd, 18 Oct. 2016, www.daxtra.com/2016/10/18/what-is-cvresume-parsing/.
  2. ^ Ratcliff, Christopher. “Search Engine Watch.” What Is Semantic Search and Why Does It Matter?, ClickZ Group Limited, 21 Oct. 2015, searchenginewatch.com/sew/opinion/2431292/what-is-semantic-search-and-why-does-it-matter.
  3. ^ a b c d e “Is Your Resume Ready for Automated Screening?” Resume Hacking, Resume Hacking, 2 Jan. 2016, www.resumehacking.com/ready-for-automated-resume-screening.
  4. ^ a b Nelson, Paul. "Natural Language Processing (NLP) Techniques for Extracting Information." Search Technologies, Search Technologies, www.searchtechnologies.com/blog/natural-language-processing-techniques.
  5. ^ Reynolds, Brandon. “The Terrible Trouble with Natural Language Processing (It's Us.).” Salesforce Blog, Salesforce.com, Inc., 17 Aug. 2016, www.salesforce.com/blog/2016/08/trouble-with-natural-language-processing.html.
  6. ^ "HR software companies? Why structuring your data is crucial for your business?". 15 April 2019.
  7. ^ “Types of Parsers and How They Work.” Daxtra, Daxtra Technologies Ltd, 26 Feb. 2014, www.daxtra.com/2014/02/26/types-of-parser-and-how-they-work/.
  8. ^ "A Top Executive Recruiter Puts Accuracy to the Ultimate Test." Resume Parsing: Putting Accuracy to the Ultimate Test, Sovren Group, Inc., www.sovren.com/resource-center/a-top-executive-recruiter-puts-accuracy-to-the-ultimate-test/.
  9. ^ a b Levinson, Meridith. “5 Insider Secrets for Beating Applicant Tracking Systems (ATS).” CIO, CIO, 1 Mar. 2012, www.cio.com/article/2398753/careers-staffing/careers-staffing-5-insider-secrets-for-beating-applicant-tracking-systems.html.
  10. ^ Bertrand, Marianne; Mullainathan, Sendhil (July 2003). "Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination". National Bureau of Economic Research. 9873. doi:10.3386/w9873.
  11. ^ “3 Ways Recruiters Can Use AI to Reduce Unconscious Bias.” Undercover Recruiter, 12 May 2017, theundercoverrecruiter.com/ai-reduce-unconscious-bias/.
  12. ^ a b c d “Baby Steps in HR Technology: What Is Resume Parsing?” Recruiterbox, Recruiterbox Inc, 12 Oct. 2017, recruiterbox.com/blog/baby-steps-in-hr-technology-what-is-resume-parsing-2/.
  13. ^ Cain, Áine. “The Real Reason 60% of Job Seekers Can't Stand the Application Process.” Business Insider, Business Insider, 16 June 2016, www.businessinsider.com/why-most-ob-seekers-cant-stand-the-application-process-2016-6.
  14. ^ Schultz, Carol. “Got a Minute? If So, Spend It Looking at Resumes.” ERE, ERE Media, 3 May 2012, www.ere.net/got-a-minute-if-so-spend-it-looking-at-resumes/.
  15. ^ a b c Cappelli, Peter. “How to Get a Job? Beat the Machines.” Time, Time Inc., 11 June 2012, business.time.com/2012/06/11/how-to-get-a-job-beat-the-machines/.
  16. ^ "What is the best resume parsing software?".
  17. ^ Hu, James. “Your Top 7 Questions About Applicant Tracking Systems, Answered.” Recruiter, Recruiter.com, Inc., 16 Aug. 2017, www.recruiter.com/i/your-top-7-questions-about-applicant-tracking-systems-answered/.
  18. ^ "up to 95% accuracy". Towards Data Science. 17 January 2018.
  19. ^ "AI technologies that help you to get hired". Skillroads.
  20. ^ "Cloud Talent Solution". Google.
  21. ^ a b Zielinski, Dave. “Does Your Resume Parser Stack Up? How to Evaluate Next-Generation Systems.” SHRM Society for Human Resource Management, SHRM, 10 May 2016, www.shrm.org/resourcesandtools/hr-topics/technology/pages/does-your-resume-parser-stack-up-how-to-evaluate-next-generation-systems.aspx?sthash.2dz2wgkl.mjjo.