Ok, so I have a column with names of databases various state affiliates use. Like so:
Filemaker
Access, Google Docs
Access, PT Avenue
Salesforce CRM
Microsoft SQL Server
iMIS
PT Avenue, Excel
Access, PT Avenue
Private Database
Google Docs, Constant Contact
Google Docs, Filemaker
Excel
Google Docs
Excel, MemberPlanetI want to produce the most commonly used database. The trick is that some of these states use multiple databases, which are listed in the same cell. I tried using this formula:
=INDEX(B5:F5,MODE(MATCH(B5:F5,B5:F5,0)))Using that formula would produce the answer "Access, PT Avenue" when the real answer I want is "Google Docs". Assuming multiple databases will always be separated with a comma, is there a way to return just the single most occurring database (that's called a substring, right?)?
22 Answers
Here's a solution similar to what Máté Juhász suggested in a comment. It's a few steps, but they're simple and don't require complicated formulas; Excel does all the heavy lifting.
- Split up the data. Since the entries are comma delimited, just highlight the data and use Text to Columns, based on commas. That gave me columns A and B, below. Note that Text to Columns overwrites the source values, so if you need to preserve them, just work with a copy/paste version of the data.
There's a space after each comma that needs to be removed so that the names aren't different due to a leading space. In C3, I used:
=TRIM(B3)and copied that down the column. I added "Database" as a column label in A1 and C1, which are the source data for the next step.
I used a pivot table on each of the A and C ranges, and stuck them one under the other in column A. It's some quick drag and drop, and few few preference settings:
I combined the results in column D with Copy and Paste-Special Values (a simple Copy/Paste will work, but this eliminated the borders). That produces one list with the counts from each column.
I used that to produce a new pivot table to consolidate the list with SUM. That's all of the database counts in a single list.
If the list is short, you can eyeball it to see the result (the pivot table sorts the names alphabetically). If there's a long list and a chance that there could be a tie for the maximum, you could copy and past the result, then sort it. I pulled the result shown under the pivot table with this formula (which will provide just the first max database if there's a tie):
=INDEX(G21:G31,MATCH(MAX(H21:H31),H21:H31,0))This finds the MAX value in the SUM column, uses MATCH to identify the row, and uses that result to select the database name from the left column.
How it works:
I'm assuming that you have data in A2:A15.
- Select Data and perform
TEXT to COLUMNto split them usingCOMMAasdelimiter. - Write this formula in
B2& fill down.
=COUNTIF($A$2:$A2,A2)
- Enter this Reverse Lookup formula to get the result.
=INDEX($A$2:$A$15,MATCH(3,$B$2:$B$15,0))
N.B.
- Logically Text Access & Google Docs both has same occurrence, but since Access comes first so the Formula pick it.
Adjust cell references in the Formula as needed.
5