Comparing Excel sheets for matches: how to compare data on different sheets and in different Excel files

If other users have permission to edit your workbook, then after opening it you may wonder: "Who changed it, and what exactly changed?" The Microsoft Spreadsheet Compare tool can help answer these questions by finding the changes and highlighting them.

Important: Spreadsheet Compare is available only with Office Professional Plus 2013 and Office 365 ProPlus.

The comparison results are displayed in a two-pane grid. The workbook on the left corresponds to the file specified in the "Compare" field, and the workbook on the right corresponds to the file specified in the "To" field. Details are displayed in the pane below the two grids. Changes are highlighted in different colors according to their type.

Interpretation of results

Other ways to work with comparison results

If you want to save the results or analyze them in another application, export them to an Excel file or copy and paste them into another program, such as Microsoft Word. You can also get a higher-fidelity view of each sheet that shows cell formatting close to what you see in Excel.

    To export the results to an Excel file that is easier to read, select Home > Export Results.

    To copy the results and paste them into another program, select Home > Copy Results to Clipboard.

    To display the cell formatting from the workbook, select Home > Show Workbook Colors.

Other reasons for comparing workbooks

    Let's say your organization is expecting an audit. You need to trace the data in important workbooks that show changes by month and year. This will help you find and fix errors before the auditors do.

    Spreadsheet Compare can be used not only to compare the contents of sheets, but also to find differences in Visual Basic for Applications (VBA) code. The results are displayed in a window where the differences can be viewed side by side.

Sometimes you need to compare two MS Excel files. The goal may be to find price discrepancies for certain items or changes in some readings; the details do not matter. What matters is that certain discrepancies have to be found.

It is worth mentioning that if an MS Excel file contains only a couple of records, there is no point in automating the comparison. If the file contains several hundred or even thousands of records, you cannot do without the computing power of a computer.

Let's model a situation in which two files have the same number of rows and the discrepancy must be found in one specific column or in several columns. Such a situation arises, for example, when you need to compare prices of goods in two price lists, or to compare athletes' measurements before and after the training season (although there have to be many of them for automation to be worthwhile).

As a working example, let's take a file with the results of fictional participants: the 100-meter run, the 3000-meter run, and pull-ups. The first file holds the measurements at the beginning of the season, and the second holds the measurements at the end of the season.

The first way to solve the problem: using only MS Excel formulas.

Since the records are arranged vertically (the most logical structure), you need to use the VLOOKUP function. If the records were arranged horizontally, you would have to use the HLOOKUP function.

To compare the results for the 100-meter run, the formula looks as follows:
=IF(VLOOKUP($B2;Sheet2!$B$2:$F$13;3;TRUE)<>D2;D2-VLOOKUP($B2;Sheet2!$B$2:$F$13;3;TRUE);"No difference")
If the values match, the formula displays "No difference"; otherwise it returns D2 minus the value found on Sheet2, that is, the change over the season. Note that TRUE as the last VLOOKUP argument requests an approximate match, which is reliable only when the key column on Sheet2 is sorted in ascending order.

The formula for the 3000-meter run is as follows:
=IF(VLOOKUP($B2;Sheet2!$B$2:$F$13;4;TRUE)<>E2;"There is a difference";"There is no difference")
If the final and initial values are not equal, the corresponding message is displayed. The formula for pull-ups can follow either of the previous two patterns, so there is no need to give it separately. The final file with the discrepancies found is shown below.

A little explanation. To make the formulas easier to read, the data from the two files was moved into one file (on different sheets), although this was not strictly necessary.
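The same lookup-and-compare logic can be sketched outside Excel. The Python snippet below is only an illustration, not the author's method: the sample figures, the column names, and the helper `compare_column` are assumptions, and it matches keys exactly instead of using VLOOKUP's approximate mode.

```python
def compare_column(start, end, column):
    """Return {key: end_value - start_value} for rows whose values differ.

    `start` and `end` map a unique record key to a dict of measurements.
    Keys that are absent from `end` are skipped.
    """
    differences = {}
    for key, row in start.items():
        other = end.get(key)
        if other is None:
            continue
        if other[column] != row[column]:
            # Round to avoid floating-point noise in the reported difference.
            differences[key] = round(other[column] - row[column], 2)
    return differences


# Hypothetical sample data: key is the record number.
start_of_season = {
    1: {"run_100m": 13.4, "run_3000m": 780, "pull_ups": 10},
    2: {"run_100m": 12.9, "run_3000m": 765, "pull_ups": 14},
}
end_of_season = {
    1: {"run_100m": 13.1, "run_3000m": 780, "pull_ups": 12},
    2: {"run_100m": 12.9, "run_3000m": 750, "pull_ups": 14},
}

changes_100m = compare_column(start_of_season, end_of_season, "run_100m")
```

Records that do not appear in the returned dictionary correspond to the "No difference" branch of the Excel formula.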


The second way to solve the problem: using MS Access.

The task can also be solved by first importing the MS Excel files into Access. The method used to import the external data makes no difference for finding the differing fields; any of the available options will do.

The last option, however, links the Excel files to Access, so when the data in the Excel files changes, the discrepancies are found automatically the next time you run the query in MS Access.

The next step after the import is to create a relationship between the tables. As the linking field, select the unique field "p / p" (the record's sequence number).
The third step is to create a simple select query using the Query Builder.

In the first column we specify which records should be displayed, and in the second the condition under which they are displayed. The second and third fields are handled in the same way.


As a result, the query displays all records whose data differs in the "Running for 100 meters" field.
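The idea behind the Access query (join the two tables on the unique key and keep only rows whose values differ) can be sketched with any SQL engine. The example below uses Python's built-in `sqlite3` module rather than Access; the table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE season_start (id INTEGER PRIMARY KEY, name TEXT, run_100m REAL);
    CREATE TABLE season_end   (id INTEGER PRIMARY KEY, name TEXT, run_100m REAL);
    INSERT INTO season_start VALUES (1, 'A', 13.4), (2, 'B', 12.9), (3, 'C', 14.0);
    INSERT INTO season_end   VALUES (1, 'A', 13.1), (2, 'B', 12.9), (3, 'C', 13.8);
""")

# Join on the unique key and keep only the records where the value changed.
rows = conn.execute("""
    SELECT s.id, s.name, s.run_100m, e.run_100m
    FROM season_start AS s
    JOIN season_end AS e ON e.id = s.id
    WHERE s.run_100m <> e.run_100m
    ORDER BY s.id
""").fetchall()
conn.close()
```

Comparing another field only requires changing the column in the `WHERE` clause.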

These are two ways to find discrepancies in MS Excel tables. Each has both advantages and disadvantages. Clearly, this is not an exhaustive list of ways to compare two Excel files.

A seemingly simple task: comparing tables or, more precisely, comparing two columns of a table for matches or differences. It is logical to assume that Excel is the ideal tool for this, but alas, I did not find a simple, free way to compare tables in Excel, except perhaps the primitive "row1 = row2". In practice, strings need some processing before comparison, since they may contain extra spaces, punctuation marks, and so on. So I decided to write a utility that compares two text files line by line and processes the lines according to the user's choices.

Working through text files was chosen as the most universal approach. It doesn't matter what the data source is, whether a plain list or an Excel spreadsheet: usually everything can be copied into a text file. So let's move on to the program itself.

Download and unzip the program. Initially it contains three files: "Compare.exe", the program itself, and "List 1.txt" and "List 2.txt", two empty text files into which you paste the strings to compare. Launch the program:

The default comparison settings are, in my opinion, optimal. The window with the comparison example exists only to let you tune the settings for your task and get a general understanding of what is happening. Do not compare real data in the example: these windows hold no more than 32 KB of text, and the rest is cut off without warning, so you may get wrong results. The program has tooltips: when you hover the mouse over a setting or item, a short description is shown.

Once you have experimented enough with the comparison example, copy your data into the files "List 1.txt" and "List 2.txt" and, with the settings you selected, click the "Process files" button. While the files are being processed, the button turns red and reads “Processing in progress”; wait until the process finishes. When it is done, look in the folder from which you started the program: depending on the settings, the files indicated in the comparison example will appear there. With each new comparison, and each time the program is opened or closed, all files except "List 1.txt" and "List 2.txt" are deleted.

And a little about comparison speed. Most real-world tasks are solved almost instantly. My test results (on a 2 GHz Intel Core processor for socket LGA 775):

Comparison of 2 lists of 1 MB each (25 characters per line and 39 thousand lines in each list). The comparison obviously has to check each line of the first list against every line of the second, which gives a total of 1.521 billion string comparisons. Execution time is about 20 seconds. Memory consumption is less than 10 MB.

Comparison of 2 lists of 10 MB each (25 characters per line and 390 thousand lines in each list). This gives 152.1 billion string comparisons. Execution time is about an hour, and the program takes about 200 MB of RAM. Such sizes, though, are already database territory. In this program I have already used every reasonable way to increase speed.
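The comparison counts quoted above follow directly from checking every line of one list against every line of the other. A quick check in Python (the function name is an illustrative assumption):

```python
def pairwise_comparisons(lines_in_list_1, lines_in_list_2):
    """Number of string comparisons when every line meets every line."""
    return lines_in_list_1 * lines_in_list_2


small_test = pairwise_comparisons(39_000, 39_000)      # 1 MB lists
large_test = pairwise_comparisons(390_000, 390_000)    # 10 MB lists
```

Ten times more lines per list means a hundred times more comparisons.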

Algorithm of work and comparison parameters

Whatever the settings, the algorithm removes from the lines all characters except Latin and Russian letters, digits, periods, and commas. Extra spaces between words and spaces at the beginning and end of each line are, of course, removed as well.
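A minimal sketch of this clean-up step in Python, not the utility's actual code. It assumes that disallowed characters are simply deleted (rather than replaced by spaces) and that whitespace is collapsed afterwards; the function name `normalize` is hypothetical.

```python
import re

# Everything except Latin/Russian letters, digits, period, comma and whitespace.
_DISALLOWED = re.compile(r"[^A-Za-zА-Яа-яЁё0-9.,\s]")
_SPACES = re.compile(r"\s+")


def normalize(line):
    """Drop disallowed characters, collapse runs of spaces, trim the edges."""
    cleaned = _DISALLOWED.sub("", line)
    return _SPACES.sub(" ", cleaned).strip()
```

For example, `normalize("  Hello,   world! (test) ")` yields `"Hello, world test"`.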

Search for matching lines and Search for differing lines - these choose whether the program looks for matching lines or for differing ones. Matches are written to the "Matches.txt" file. Differing lines are written to two files, "Mismatches 1.txt" and "Mismatches 2.txt", for lists 1 and 2 respectively. In this mode the comparison example area also shows two windows instead of one.

ATTENTION! Comparing lists for matches has one peculiarity: since the same lines are present in both lists, the result is taken from list 1. When comparing for matches, put the more neatly formatted text in list 1.

Fix keyboard layout errors - this is, of course, a long way from Punto Switcher. It refers to look-alike letters typed in the wrong layout (C, H, P, and so on), for example the Russian "с" and the Latin "c". They share the same key, so if a word begins with "с" you may type the first letter in the English layout and then switch to Russian, or vice versa. The replacement rule is: if a word contains more Russian letters than English ones, the English letters are replaced with Russian ones, and vice versa.

Replace Ё (ё) with Е (е) - every "ё" is simply replaced with "е".

Compare case insensitive - all letters are converted to uppercase.

Compare by unique strings - if this parameter is enabled, each list is first checked for repeated strings. If a string is repeated, for example, 5 times, one copy remains in the list for comparison and the other 4 are sent to the list of "repeats". Each list has its own repeats.

Without this parameter, the strings are compared in pairs. For example, when comparing for matching lines, if list 1 contains 2 identical lines and list 2 contains 3 such lines, the result will contain only two lines, since the third line found no pair to match. If, for the same example, we switch to searching for differing lines, one line from list 2 will appear in the result, since it has nothing left to match.
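Pairwise comparison behaves like multiset arithmetic, which Python's `collections.Counter` models directly. The sketch below reproduces the 2-versus-3 example; it is an assumption about the behavior described, not the utility's implementation, and the function names are hypothetical.

```python
from collections import Counter


def pairwise_matches(list_1, list_2):
    """Each line is matched at most once: multiset intersection."""
    return list((Counter(list_1) & Counter(list_2)).elements())


def pairwise_mismatches(list_1, list_2):
    """Lines left without a pair in each list: multiset differences."""
    count_1, count_2 = Counter(list_1), Counter(list_2)
    return list((count_1 - count_2).elements()), list((count_2 - count_1).elements())


matches = pairwise_matches(["a", "a"], ["a", "a", "a"])
mismatches = pairwise_mismatches(["a", "a"], ["a", "a", "a"])
```

Here `matches` contains two lines, and the only unmatched line is the third copy in list 2.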

Using unique string comparison, you can find duplicate strings in a list. To do this, you can, for example, put your lines only in the "List 1.txt" file and compare it with an empty "List 2.txt"; the file "RepeatСп1.txt" will then contain the repeated lines from list 1.

Repeats only for rows from the result - works only together with comparison by unique strings. Without this parameter, all duplicate lines are included in the repeat lists. If it is enabled, only repeats of lines present in the result are included. For each repeated line, the number of copies placed in the repeats equals the number of its occurrences in the original list minus 1.
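The split into unique lines and repeats, and the optional filter by result, can be sketched like this. The function names are hypothetical and the code only illustrates the behavior described, assuming that the first occurrence is the copy kept for comparison.

```python
def split_unique_and_repeats(lines):
    """Keep the first copy of each line; send every later copy to repeats."""
    seen = set()
    unique, repeats = [], []
    for line in lines:
        if line in seen:
            repeats.append(line)
        else:
            seen.add(line)
            unique.append(line)
    return unique, repeats


def repeats_in_result(repeats, result):
    """'Repeats only for rows from the result': drop repeats of other lines."""
    wanted = set(result)
    return [line for line in repeats if line in wanted]


unique, repeats = split_unique_and_repeats(["x", "x", "y", "x", "x", "x", "y"])
```

A line that occurs 5 times contributes one entry to `unique` and 4 entries to `repeats`.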

Remove periods and commas and Remove all spaces - these characters are simply deleted, nothing more.
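The simple per-line options (replacing "ё", ignoring case, removing periods, commas, and spaces) amount to a few string replacements. A hedged sketch follows; the parameter names are invented for illustration and do not come from the program.

```python
def apply_options(line, replace_yo=False, ignore_case=False,
                  drop_punctuation=False, drop_spaces=False):
    """Apply the selected simple transformations to one line."""
    if replace_yo:
        # U+0451/U+0401 are the letters "yo"; U+0435/U+0415 are plain "e".
        line = line.replace("\u0451", "\u0435").replace("\u0401", "\u0415")
    if ignore_case:
        line = line.upper()
    if drop_punctuation:
        line = line.replace(".", "").replace(",", "")
    if drop_spaces:
        line = line.replace(" ", "")
    return line
```

With every option enabled, `"a. b, C"` becomes `"ABC"`.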

After installing the add-in, a new tab appears with the command that calls the function. When you click the Comparing ranges command, a dialog box for entering parameters appears.

This macro lets you compare tables of any size and with any number of columns. Tables can be compared on one, two, or three columns at the same time.

The dialog box is divided into two parts: left for the first table and right for the second.

To compare tables, follow these steps:

  • Specify table ranges.
  • Tick the checkbox under the selected table range if the table includes a header row.
  • Select the columns of the left and right tables by which the comparison will be carried out (if the ranges of tables do not include headers, the columns will be numbered).
  • Specify the type of comparison.
  • Select the option for displaying the results.

Comparison type of tables

The program allows you to select several types of table comparison:

Find rows in one table that are missing in another table

With this type of comparison, the program searches for rows of one table that are not present in the other. If you compare tables on several columns, the result contains rows that differ in at least one of those columns.
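This mode is what databases call an anti-join. A minimal Python sketch, assuming tables are lists of tuples and the comparison columns are given by index (both are illustrative assumptions, as is the function name):

```python
def rows_missing_from(table_a, table_b, key_columns):
    """Rows of table_a whose key values do not occur in table_b."""
    keys_b = {tuple(row[c] for c in key_columns) for row in table_b}
    return [row for row in table_a
            if tuple(row[c] for c in key_columns) not in keys_b]


table_1 = [("A", 1), ("B", 2), ("C", 3)]
table_2 = [("A", 1), ("B", 5)]

missing_by_name = rows_missing_from(table_1, table_2, [0])
missing_by_both = rows_missing_from(table_1, table_2, [0, 1])
```

Comparing on both columns also reports row "B", because its second column differs.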

Find matching rows

With this type of comparison, the program finds rows that match in the first and second tables. Rows are considered matching if the values in the selected comparison columns (1, 2, 3) of one table fully coincide with the values in the corresponding columns of the second table.

An example of the program's work in this mode is shown on the right in the picture.

Map tables based on selected

In this comparison mode, the data of the matching row from the second table is copied next to each row of the first table (selected as the main one). If there is no matching row, the cells next to the main table's row remain empty.
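This mode corresponds to a left join. In the sketch below, `None` stands for the empty cells; keeping the first match when the second table has several rows with the same key is an assumption, as are the function name and the data layout.

```python
def map_tables(main, other, key=0):
    """Pair every row of `main` with the matching row of `other`, or None."""
    lookup = {}
    for row in other:
        lookup.setdefault(row[key], row)  # keep the first match (assumption)
    return [(row, lookup.get(row[key])) for row in main]


mapped = map_tables([("A", 1), ("B", 2)], [("B", 20), ("C", 30)])
```

Row "A" has no counterpart, so it is paired with `None`, while row "B" is paired with `("B", 20)`.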

Comparing tables on four or more columns

If the program's functionality is not enough and you need to match tables on four or more columns, you can work around it as follows:

  • Create an empty column in each table.
  • In the new columns, use the CONCATENATE function to merge the columns you want to compare.

This way you end up with one column containing the values of several columns. And you already know how to match on one column.
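The same trick in Python: build one composite key from several columns. One caveat worth knowing is that plain concatenation can make different rows look identical ("ab" + "c" equals "a" + "bc"), so the sketch inserts a separator. The separator character and the function name are assumptions for illustration.

```python
def composite_key(row, columns, separator="|"):
    """Join the chosen column values into a single comparison key."""
    return separator.join(str(row[c]) for c in columns)


key_1 = composite_key(("ab", "c", 2020, "x"), [0, 1, 2, 3])
key_2 = composite_key(("a", "bc", 2020, "x"), [0, 1, 2, 3])
```

With the separator the two keys differ; with `separator=""` they would collide.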

Let's compare two tables with practically the same structure. The tables differ in the values of individual rows, and some row names appear in one table but may be absent from the other.

Suppose the sheets January and February contain two tables with the turnovers for the period for the corresponding accounts.

As you can see from the figures, the tables differ:

  1. The presence or absence of rows (account names). For example, the table on the January sheet has no account 26 (see the example file), and the table on the February sheet is missing account 10 and its sub-accounts.
  2. Different values in the rows. For example, for account 57 the turnovers for January and February do not match.

If the structures of the tables are approximately the same (most of the names of the accounts (rows) are the same, the number and names of the columns are the same), then you can compare the two tables. Let's make the comparison in two ways: one is easier to implement, the other is clearer.

A simple option for comparing 2 tables

First, let's determine which rows (account names) are present in one table but absent from the other. Then, in the table with fewer missing rows (the most complete table), we display a comparison report showing the column-by-column difference (the difference in turnovers for January and February).

The main disadvantage of this approach is that the comparison report does not include rows that are missing from the most complete table. For example, in our case the most complete table is the one on the January sheet, yet it lacks account 26 from the February table.

To determine which of the two tables is the most complete, you need to answer two questions: Which accounts in the February table are missing from the January one? And which accounts in the January table are missing from the February one?

This can be done using the formulas (see column E): =IF(ISNA(VLOOKUP(A7;January!$A$7:$A$81;1;0));"No";"Yes") and =IF(ISNA(VLOOKUP(A7;February!$A$7:$A$77;1;0));"No";"Yes")

The turnovers by account are compared using the formulas: =IF(ISNA(VLOOKUP($A7;February!$A$7:$C77;2;0));0;VLOOKUP($A7;February!$A$7:$C77;2;0))-B7 and =IF(ISNA(VLOOKUP($A7;February!$A$7:$C77;3;0));0;VLOOKUP($A7;February!$A$7:$C77;3;0))-C7

If there is no corresponding row, VLOOKUP() returns the #N/A error, which the combination of ISNA() and IF() handles by substituting 0 (when the row is missing) or the value from the corresponding column.
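The logic of these formulas can be sketched in Python with dictionaries, where a missing key plays the role of #N/A. The sample figures are invented, and calling the two turnover columns "debit" and "credit" is purely an assumption for illustration.

```python
def missing_accounts(table, other):
    """Accounts present in `table` but absent from `other` (the ISNA check)."""
    return [account for account in table if account not in other]


def turnover_difference(january, february):
    """February minus January per account; a missing February row counts as 0."""
    result = {}
    for account, (jan_debit, jan_credit) in january.items():
        feb_debit, feb_credit = february.get(account, (0, 0))
        result[account] = (feb_debit - jan_debit, feb_credit - jan_credit)
    return result


# Hypothetical sample data: account -> (debit turnover, credit turnover).
january = {"10": (100, 50), "57": (200, 200)}
february = {"26": (30, 30), "57": (250, 180)}

report = turnover_difference(january, february)
```

As in the Excel version, account 26 does not appear in `report`, because the report is built from the January table.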

Use conditional formatting to highlight discrepancies (for example, in red).

A more visual option for comparing 2 tables (but more complex)

By analogy with the problem solved in the article, you can create a list of account names, including ALL account names from both tables (without repetitions). Then print the difference column by column.

This requires:

  1. With =IFERROR(IFERROR(INDEX(January;MATCH(0;COUNTIF(A$4:$A4;January);0));INDEX(February;MATCH(0;COUNTIF(A$4:$A4;February);0)));"") form in column A a list of accounts from both tables (without repetitions);
  2. With =IFERROR(INDEX(List;MATCH(SMALL(COUNTIF(List;"<"&List);ROW()-ROW($B$4));COUNTIF(List;"<"&List);0));"") where List is