The gist for this pattern can be found here.
The goal
Sometimes we might have numbers in a text file formatted like this:
I’d like to create a custom column that converts this text into a currency data type, preserving the scale of the suffix in each row and preserving the currency symbol.
A solution
I say “a” solution, because I’m sure there are many others.
Here’s a snippet we can paste into the Advanced Editor.
#"Converted currency text" = Table.AddColumn(#"Previous query step","new_column_name",each
let
//convert the original text to lower case
lower = Text.Lower([currency_as_text]),,
//add as many Text.Replace as you need to remove unwanted words
//in case of many words to remove, could iterate a list of words
words_removed = Text.Replace(lower,"unknown",""),
//for text $180B, following split creates a list {"$180","b"}
//use this splitter instead of Text.End in case suffix is multiple characters
split = Splitter.SplitTextByCharacterTransition(
{"0".."9"}, // split when one of these
{"a".."z"} // changes to one of these
)(words_removed), // use the splitter function on the words_removed variable
//get the second list item created above
//e.g. "b"
//if the original value doesn't have a suffix, there's an error here, so put "nope" instead
suffix = try split{1} otherwise "nope",
//now define a record to use as a lookup
//we'll use these numbers with the Number.Power function
lookup = [b = 9, bn = 9, bns = 9,
m = 6, mn = 6, mns = 6, mm = 6,
k = 3, nope = 0],
//get the first list item (the amount)
//e.g. $180
numtext = split{0},
//convert amount to numeric (should handle currency symbol automatically)
//e.g. 180
num = Number.FromText(amt),
//multiply the number by 10 raised to the power from the lookup record created above
//e.g. 180 * 10^9 = 180000000000
new_num = num * Number.Power(10,Record.Field(lookup,suffix))
//ignore errors
in try new_num otherwise null
)
Alternatively, we can just paste lines 2 through 41 into a Custom Column dialog.
I know, it looks long, but I’ve put a lot of comments in there to make it as simple as possible to understand.
If we paste the longer code into the advanced editor, we need to:
- Change line 1 where it says “Previous query step” to match the previous step in the current query (depending where this is pasted)
- Change the “new_column_name” on line 1
- Change the column name on line 4 to match the column name in your data that has the currency stored as text
- Add any additional Text.Replace steps after line 8 in case you have unwelcome words in your column. Alternatively, you can write a small loop to iterate through a list of words
- Add any items to the lookup record on line 24. The value in each name = value pair represents a power of 10 that the numbers in that row will be raised to
After adding the code, the new column is ready for use:
You can of course shorten the code by removing the comments and combining steps:
#"Added Custom" = Table.AddColumn(#"Changed Type", "currency_as_number", each
let
split = Splitter.SplitTextByCharacterTransition(
{"0".."9"},
{"a".."z"}
)(Text.Replace(Text.Lower([currency_as_text]),"unknown","")),
lookup = [b = 9, bn = 9, bns = 9,
m = 6, mn = 6, mns = 6, mm = 6,
k = 3, nope = 0],
out_shake_it_all_about = Number.FromText(split{0}) * Number.Power(10,Record.Field(lookup,try split{1} otherwise "nope"))
in out_shake_it_all_about
)
I’ve saved the gist with all the comments so I can remember what I did later.
How it works
Looking again at the code:
#"Converted currency text" = Table.AddColumn(#"Previous query step","new_column_name",each
let
//convert the original text to lower case
lower = Text.Lower([currency_as_text]),,
//add as many Text.Replace as you need to remove unwanted words
//in case of many words to remove, could iterate a list of words
words_removed = Text.Replace(lower,"unknown",""),
//for text $180B, following split creates a list {"$180","b"}
//use this splitter instead of Text.End in case suffix is multiple characters
split = Splitter.SplitTextByCharacterTransition(
{"0".."9"}, // split when one of these
{"a".."z"} // changes to one of these
)(words_removed), // use the splitter function on the words_removed variable
//get the second list item created above
//e.g. "b"
//if the original value doesn't have a suffix, there's an error here, so put "nope" instead
suffix = try split{1} otherwise "nope",
//now define a record to use as a lookup
//we'll use these numbers with the Number.Power function
lookup = [b = 9, bn = 9, bns = 9,
m = 6, mn = 6, mns = 6, mm = 6,
k = 3, nope = 0],
//get the first list item (the amount)
//e.g. $180
numtext = split{0},
//convert amount to numeric (should handle currency symbol automatically)
//e.g. 180
num = Number.FromText(amt),
//multiply the number by 10 raised to the power from the lookup record created above
//e.g. 180 * 10^9 = 180000000000
new_num = num * Number.Power(10,Record.Field(lookup,suffix))
//ignore errors
in try new_num otherwise null
)
- Convert the original data to lower case using Text.Lower. This helps with the later steps
- Removing an unwanted word from the column using Text.Remove
- Using Splitter.SplitTextByCharacterTransition, which allows us to split each row’s value at the point where a number is followed by a letter
- Extracting the B, M etc suffix into a variable and if there is no suffix, just putting the word “nope”
- Defining a record called lookup which holds the name:value pairs for each suffix. The value is the power of 10 to raise the number by
- Extracting the number from the split list into a variable called numtext
- Converting the numtext to a number. If the currency code is all the same, this should work without error, but it might require modification in case of mixed or unrecognized symbols
- Calculating the new_num by multiplying the extracted num by the power of 10 that’s found in the lookup record
In summary
We can use a series of transformations in PowerQuery that each depend on the last to apply cleaning steps to a column.
The Splitter family of functions can be used to separate parts of text into a list of parts on each row.
The list elements are accessed by zero-based index in braces such as list{0}.
We can use a record data type as a kind of lookup dictionary. The values can be looked up using Record.Field(record,name)
I hope this has been useful for you. Writing these short posts help cement these techniques in my head!
Let me know in the comments if there are other more succinct ways to achieve this transformation.