As per Microsoft documentation on UniqueIdentifier, This value is always a unique globally because it's based on network clock and CPU clock time. On the other hand, the same documentation says:
uniqueidentifier columns may contain multiple occurrences of an individual uniqueidentifier value, unless the UNIQUE or PRIMARY KEY constraints are also specified for the column.
I'm not able to come to a conclusion how UniqueIdentifier (GUIDs) can be unique globally, as Network address (Mac address) can be same on two different networks. With which combination of inputs can GUIDs be unique globally, and why does Microsoft say there should be a primary key or unique constraint in order to ensure we always have a unique UniqueIdentifier value?
Viswanathan Iyer
2 Answers
Problem #1
As per Microsoft documentation on UniqueIdentifier, This value is always a unique globally because it's based on network clock and CPU clock time... (emphasis added)
The main problem here is that you are confusing two different things as being two terms that refer to one thing: UNIQUEIDENTIFIER and GUIDs.
- UNIQUEIDENTIFIER is a datatype. Datatypes define the nature of the data that they (i.e. columns and variables of this type) can contain (e.g. min / max values, etc) and certain behaviors of the data (e.g. how to handle comparisons). This particular datatype merely holds GUID / UUID values. But it is not data, so the concept of uniqueness does not apply to it. And the word 'Unique' in the name 'UniqueIdentifier' is not a promise, or even statement, regarding actual uniqueness.
- GUIDs / UUIDs are actual values that can be stored as UNIQUEIDENTIFIER, but could also be stored as VARBINARY / BINARY, (N)VARCHAR / (N)CHAR, and maybe some others. While the UNIQUEIDENTIFIER datatype is the best choice (in SQL Server) for storing these values, storing the values in the other types does not make the values any more or less unique.
Problem #2
I'm not able to come to a conclusion how UniqueIdentifier (GUIDs) can be unique globally, as Network address (Mac address) can be same on two different networks
The second problem here is that you are accepting, as fact, a technical error in the documentation that you linked to. I assume you are referring to this statement:
A GUID is a unique binary number; no other computer in the world will generate a duplicate of that GUID value.
That statement is referring to functions like NEWID() in T-SQL and Guid.NewGuid() in .NET that create new GUID / UUID values, and the intention of them to always generate unique values. However, that is not reality: newly generated GUIDs are not guaranteed to be unique. As you already pointed out, MAC Addresses aren't necessarily unique (they can even be spoofed; more info in the 'Related info' section below). Also, from other Microsoft documentation:
- MSDN page for .NET Guid Structure states (emphasis added):
  A GUID is a 128-bit integer (16 bytes) that can be used across all computers and networks wherever a unique identifier is required. Such an identifier has a very low probability of being duplicated.
- The .NET Guid.NewGuid() method (which is used to generate new GUID / UUID values) calls Win32Native.CoCreateGuid. The documentation for that function states (emphasis added):
  To a very high degree of certainty, this function returns a unique value – no other invocation, on the same or any other system (networked or not), should return the same value.
Please note that the non-SQL Server documentation doesn't even mention MAC Address. And the documentation for CoCreateGuid points to the real function that does the generation: UuidCreate. The documentation for that function states:
For security reasons, it is often desirable to keep ethernet addresses on networks from becoming available outside a company or organization. The UuidCreate function generates a UUID that cannot be traced to the ethernet address of the computer on which it was generated. It also cannot be associated with other UUIDs created on the same computer. If you do not need this level of security, your application can use the UuidCreateSequential function, which behaves exactly as the UuidCreate function does on all other versions of the operating system.
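A UUID that cannot be traced to hardware is typically a random one (RFC 4122, version 4). As a rough illustration outside SQL Server, the sketch below uses Python's standard uuid module to generate such a value, reads its version field, and estimates the chance of a duplicate with the usual birthday approximation. The helper name collision_probability is made up for this example, and the 122-bit figure assumes all non-fixed bits are uniformly random.

```python
import math
import uuid

# A version 4 UUID has 128 bits; 6 are fixed (version + variant), 122 are random.
RANDOM_BITS = 122


def collision_probability(n: int) -> float:
    """Birthday approximation: chance of at least one duplicate among n random v4 UUIDs."""
    pairs = n * (n - 1) / 2
    return -math.expm1(-pairs / 2**RANDOM_BITS)


sample = uuid.uuid4()
# The first character of the third group is the version digit.
print(sample, "version:", sample.version, "digit:", str(sample)[14])

# Tiny for any realistic row count, but never exactly zero.
print(collision_probability(10**9))
```

The probability only approaches one half at a few quintillion generated values, which is why duplicates are extremely unlikely yet still not impossible.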
The implication here is that MAC Address is specifically not used (unless using NEWSEQUENTIALID()). And in fact, generating a few GUIDs in SQL Server via NEWID() indicates that they are RFC 4122, Version 4 UUIDs, which are extremely likely to be unique. There is a chart here, Random UUID probability of duplicates, that shows just how unlikely it is to have duplicates. However, even a very, very low probability of duplicates is not a guarantee of uniqueness.
And so...
There is no guarantee that newly generated GUID / UUID values are unique. And, even if there was a guarantee, the UNIQUEIDENTIFIER datatype would still have nothing to do with actual uniqueness (as is shown in Brent's answer). Uniqueness, for one or more columns (i.e. data, not datatypes) can only be enforced by Unique Indexes / Constraints.
Related info:
Solomon Rutzky
Because a UNIQUEIDENTIFIER column can store whatever UNIQUEIDENTIFIER you put in it. Take this code:
There's nothing stopping you from doing that - unless you specify something about the table that enforces uniqueness.
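A minimal sketch of that point, using Python's sqlite3 module as a stand-in for SQL Server. SQLite has no UNIQUEIDENTIFIER type, so the GUID is stored as text here, and the table names no_constraint and with_constraint are hypothetical. The behavior it illustrates is the same idea: the column accepts a repeated value unless a constraint forbids it.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical tables; the GUID is stored as text because SQLite lacks UNIQUEIDENTIFIER.
conn.execute("CREATE TABLE no_constraint (id TEXT)")
conn.execute("CREATE TABLE with_constraint (id TEXT UNIQUE)")

guid = str(uuid.uuid4())

# Without a constraint, the same GUID value can be inserted twice.
conn.execute("INSERT INTO no_constraint VALUES (?)", (guid,))
conn.execute("INSERT INTO no_constraint VALUES (?)", (guid,))
duplicates = conn.execute(
    "SELECT COUNT(*) FROM no_constraint WHERE id = ?", (guid,)
).fetchone()[0]

# With a UNIQUE constraint, the second insert is rejected.
conn.execute("INSERT INTO with_constraint VALUES (?)", (guid,))
try:
    conn.execute("INSERT INTO with_constraint VALUES (?)", (guid,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("rows holding the same GUID:", duplicates)
print("second insert rejected by UNIQUE:", rejected)
```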
Brent Ozar
By: Greg Robidoux | Last Updated: 2007-09-11
Problem
When combining multiple datasets you have always had the ability to use the UNION and UNION ALL operators to pull a distinct result set (UNION) or a complete result set (UNION ALL). These are very helpful commands when you need to pull data from different tables and show the results as one unified result set. Conversely, it would be helpful to show only the rows where both sets of data match, or only the rows that exist in one of the tables and not the other. This could be done using different join types, but what other options does SQL Server offer?
Solution
With SQL Server 2005, Microsoft introduced the INTERSECT and EXCEPT operators to further extend what you could already do with the UNION and UNION ALL operators.
- INTERSECT - gives you the final result set where values in both of the tables match
- EXCEPT - gives you the final result set where data exists in the first dataset and not in the second dataset
The advantage of these commands is that they let you get a distinct listing across all of the columns, as the UNION operator does, without having to do a GROUP BY or a comparison of every single column.
Like the UNION and UNION ALL operators, both queries need to return the same number of columns in the same order, and the columns need to have compatible data types.
For example, say we have two tables, Manager and Customer. Both of these tables have roughly the same structure, including the following columns:
- FirstName
- LastName
- AddressLine1
- City
- StateProvinceCode
- PostalCode
Here is the Manager table sample data:
Here is the Customer table sample data:
We want to do two queries:
- Find the occurrences where a manager is a customer (intersect)
- Find the occurrences where the manager is not a customer (except)
SQL Server INTERSECT Examples
If we want to find out which people exist in both the customer table and the manager table and get a distinct list back we can issue the following command:
Here is the result set:
To do this same thing with a regular T-SQL command we would have to write the following:
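A minimal sketch of both forms, using Python's sqlite3 module as a stand-in for SQL Server (SQLite also supports INTERSECT). The rows are made up for illustration and are not the article's sample data, and the column list is shortened to three of the columns listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manager  (FirstName TEXT, LastName TEXT, City TEXT);
    CREATE TABLE Customer (FirstName TEXT, LastName TEXT, City TEXT);
    -- Made-up rows for illustration only.
    INSERT INTO Manager  VALUES ('Ann', 'Lee', 'Boston'), ('Bob', 'Ray', 'Denver'), ('Cal', 'Fox', 'Austin');
    INSERT INTO Customer VALUES ('Ann', 'Lee', 'Boston'), ('Ann', 'Lee', 'Boston'), ('Dee', 'Kim', 'Miami');
""")

# Set operator: distinct rows that appear in both result sets.
intersect_rows = conn.execute("""
    SELECT FirstName, LastName, City FROM Manager
    INTERSECT
    SELECT FirstName, LastName, City FROM Customer
""").fetchall()

# Longer form: DISTINCT plus a correlated EXISTS that compares every column.
exists_rows = conn.execute("""
    SELECT DISTINCT m.FirstName, m.LastName, m.City
    FROM Manager AS m
    WHERE EXISTS (
        SELECT 1 FROM Customer AS c
        WHERE c.FirstName = m.FirstName
          AND c.LastName  = m.LastName
          AND c.City      = m.City
    )
""").fetchall()

print(intersect_rows)
print(exists_rows)
```

One difference worth keeping in mind: set operators treat two NULLs as a match, while the column-by-column equality test in the EXISTS form does not, so the two queries can diverge on nullable columns.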
SQL Server EXCEPT Examples
If we want to find out which people exist in the manager table, but not in the customer table, and get a distinct list back, we can issue the following command:
Here is the result set:
To do this same thing with a regular T-SQL command we would have to write the following:
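A minimal sketch of both forms, again using Python's sqlite3 module as a stand-in for SQL Server (SQLite also supports EXCEPT). The rows are made up for illustration, and the column list is shortened to three of the columns listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manager  (FirstName TEXT, LastName TEXT, City TEXT);
    CREATE TABLE Customer (FirstName TEXT, LastName TEXT, City TEXT);
    -- Made-up rows for illustration only; Bob appears twice on purpose.
    INSERT INTO Manager  VALUES ('Ann', 'Lee', 'Boston'), ('Bob', 'Ray', 'Denver'),
                                ('Bob', 'Ray', 'Denver'), ('Cal', 'Fox', 'Austin');
    INSERT INTO Customer VALUES ('Ann', 'Lee', 'Boston'), ('Dee', 'Kim', 'Miami');
""")

# Set operator: distinct Manager rows that have no match in Customer.
except_rows = conn.execute("""
    SELECT FirstName, LastName, City FROM Manager
    EXCEPT
    SELECT FirstName, LastName, City FROM Customer
""").fetchall()

# Longer form: DISTINCT plus a correlated NOT EXISTS over every column.
not_exists_rows = conn.execute("""
    SELECT DISTINCT m.FirstName, m.LastName, m.City
    FROM Manager AS m
    WHERE NOT EXISTS (
        SELECT 1 FROM Customer AS c
        WHERE c.FirstName = m.FirstName
          AND c.LastName  = m.LastName
          AND c.City      = m.City
    )
""").fetchall()

print(sorted(except_rows))
print(sorted(not_exists_rows))
```

The order of the two queries matters with EXCEPT: swapping them would return customers who are not managers instead.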
From the two examples above we can see that using the EXCEPT and INTERSECT commands is much simpler than having to write the join or exists statements.
To take this a step further, if we had a third table (or fourth...) that listed sales reps and we wanted to find out which managers were customers, but not sales reps, we could do the following.
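A minimal sketch of chaining the two operators, using Python's sqlite3 module as a stand-in for SQL Server. The rows are made up for illustration, and only two of the columns are used to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manager  (FirstName TEXT, LastName TEXT);
    CREATE TABLE Customer (FirstName TEXT, LastName TEXT);
    CREATE TABLE SalesRep (FirstName TEXT, LastName TEXT);
    -- Made-up rows for illustration only.
    INSERT INTO Manager  VALUES ('Ann', 'Lee'), ('Bob', 'Ray'), ('Cal', 'Fox');
    INSERT INTO Customer VALUES ('Ann', 'Lee'), ('Bob', 'Ray'), ('Dee', 'Kim');
    INSERT INTO SalesRep VALUES ('Bob', 'Ray');
""")

# Managers who are also customers, minus anyone who is a sales rep.
rows = conn.execute("""
    SELECT FirstName, LastName FROM Manager
    INTERSECT
    SELECT FirstName, LastName FROM Customer
    EXCEPT
    SELECT FirstName, LastName FROM SalesRep
""").fetchall()

print(rows)
```

SQL Server evaluates INTERSECT before EXCEPT and UNION, while SQLite groups compound operators from left to right. The two agree for this particular query, but for other combinations it is safer to check the evaluation order (SQL Server accepts parentheses around the individual queries).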
Here is the SalesRep table sample data:
Here is the result set:
As you can see, it is pretty simple to mix and match these statements. In addition, you could also use the UNION and UNION ALL operators to further extend your final result sets.
Next Steps
- Take a look at your existing code to see how the INTERSECT and EXCEPT operators could be used
- Keep these new operators in mind next time you need to compare different datasets with like data
About the author
Greg Robidoux is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.