There are times when you will need to pull words out of a string, and this page will show a couple of methods, a couple of quick and simple ways of getting a word from s string, or breaking up a short string into its component words, and also a full accurate method to cope with longer (and more words) strings.
Be aware that these hints cover only formula solutions, whereas the Data/Text To Columns command, or the Flash Fill feature, will complete these, or similar requirements, very easily.
The simple methods shown above to get first name (forename) and last name (surname), are fine when names are simple (one word for each).
To get the first name, a formula like this will do:
=LEFT(A2,FIND(" ",A2)-1)
and for last name, this is one way ... which returns the rest of the string after the first name:
=MID(A2,FIND(" ",A2)+1,999)
... or to return the last word/name of string with any number of words/names, use this:
=TRIM(RIGHT(SUBSTITUTE(A2," ",REPT(" ",100)),100))
Examples can be found in the download workbook on the home page. (file Extract names simple.xlsx)
The short quick-and-easy method, pictured above, does not scale well. In other words, it is fine for fetching words from a string that has only a few words, but not so good if the string has dozens of words.
If we take an example, for a cell, C4, to return something from an initial string (say, stored in A4), fetching the second word (as would be specified by a number 2 in referenced cell C1), we would use this formula:
=TRIM(MID(SUBSTITUTE($A4,",",REPT(" ",100)),MAX(1,(C$1-1)*100-50),100))
... a full explanation, together with an example, can be found in the download file on the home page. (workbook Extract words from string quick formula.xlsx)
Supposing we have a phrase, of many many words, and we want to pick out each word individually ... it would be best to arrange the word numbers across a row, a number in each column, going from 1 to however many words are required.
Also, any phrase that contains punctuation would need to be cleaned up, using the SUBSTITUTE function to remove things like "." (full stop) "," (comma) ";" (semi-colon) "!" (exclamation mark) "?" (question mark) "/" (slash) " (double-quote) "-" (dash/hyphen) and "(" or ")" (brackets).
To fully examine the solution, download the workbook from the home page (filename Extract words from a phrase.xlsx)
If we create a reference cell for each string, one that counts how many words are in the string, then it will be easy to code the formula to stop processing once all required words have been found. In the formula, as shown here:
=IF(D$2>$C3,"",MID($B3,IFERROR(FIND("^",SUBSTITUTE($B3," ","^",D$2-1)),0)+1,IFERROR(FIND("^",SUBSTITUTE($B3," ","^",D$2)),LEN($B3)+1)-IFERROR(FIND("^",SUBSTITUTE($B3," ","^",D$2-1)),0)-1))
... the initial clause of the IF function simply tests for the case that all words have already been processed, by comparing the number of words reference against the word number in each column.
The graphic shows pseudo-code of the formula ... translating the functions and operations directly into English words. All well and good, but maybe a little difficult to understand. So here is a narrative explanation of how the formula works:
This formula first tests to see if we need to extract the word (i.e. is the required word number higher than the actual number of words), and if not, just return nothing (a null string: ""). Otherwise we extract the required word, by using MID from x to y, where x is the beginning of the word, calculated by pointing at the space before the required word (i.e. use SUBSTITUTE to change that character to "^", and use FIND to find it, and add 1), and where y is calculated by finding the position of the space after the word, and subtracting x, as it were.
The processing of names in order to extract the surname can be complex in real life. I have provided a set of formulas and VBA that can aid in such processing. These are in an example workbook named Get last word or words methods MACROS REMOVED.xlsx - which can be downloaded here. Note that the User Defined Functions VBA has been removed for safety reasons, but the text of the functions is in a text box in the worksheet, and so can be recreated easily. Also note that a couple of the methods use functions only available in Excel 2019 and above, and will show a NAME error when used in earlier versions of Excel.
King Of Excel
Copyright © 2021 Glenn Bamford - All Rights Reserved.
Powered by GoDaddy Website Builder