Create VIEW (count duplicate values in column) - sql

I have a small project with a SQL database that has a table with several columns.
Question: How can I create a view in SQL Server that counts how many duplicate values a column contains and shows that number in the next column?
Below is the result I want to get:
|id|name|count|
|1 |tom | |
|2 |tom | |
|3 |tom | |
| | | 3 |
|4 |leo | |
| | | 1 |

A view is simply a SELECT statement preceded by CREATE VIEW <name> AS. This allows, for example, one person (a DBA) to maintain (create/alter) complex views while another person (a developer) only has the rights to select from them.
So to use @Stidgeon's answer (below):
CREATE VIEW MyCounts
AS
SELECT name, COUNT(id) AS counts
FROM table
GROUP BY name
and later you can query
Select * from MyCounts where counts > 1 order by name
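or whatever you need to do. Note that ORDER BY is not allowed in a view definition in SQL Server unless TOP, OFFSET, or FOR XML is also specified, so do the sorting in the query that selects from the view.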
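The view-then-query pattern above can be tried out quickly. The following is only a sketch that uses Python's built-in sqlite3 module instead of SQL Server; the table name people is a hypothetical stand-in for the asker's table.

```python
import sqlite3

# In-memory SQLite database; "people" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO people (id, name) VALUES (1, 'tom'), (2, 'tom'), (3, 'tom'), (4, 'leo');

    -- The view holds the grouping logic; callers only select from it.
    CREATE VIEW MyCounts AS
    SELECT name, COUNT(id) AS counts
    FROM people
    GROUP BY name;
""")

# Sorting happens in the query against the view, not inside the view.
duplicated = conn.execute(
    "SELECT name, counts FROM MyCounts WHERE counts > 1 ORDER BY name"
).fetchall()
print(duplicated)  # [('tom', 3)]
```

Only names that occur more than once are returned, because the filter is applied to the view's counts column.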

You can do what you want with grouping sets:
select id, name, count(*)
from t
group by grouping sets ((id, name), (name));
Grouping on (id, name) is redundant by itself; its count should always be 1, assuming id is unique. However, it allows the use of grouping sets, which is a convenient way to get the detail rows and the per-name totals from a single query.
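To see what the two grouping sets produce, here is a rough sketch in Python with sqlite3. SQLite does not support GROUPING SETS, so the query below emulates it with UNION ALL of two GROUP BY queries; this is a swapped-in technique, not the syntax you would use in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO t (id, name) VALUES (1, 'tom'), (2, 'tom'), (3, 'tom'), (4, 'leo');
""")

# Emulation of GROUP BY GROUPING SETS ((id, name), (name)):
# one branch per grouping set, with NULL for the column that is not grouped.
rows = conn.execute("""
    SELECT id, name, cnt
    FROM (
        SELECT id, name, COUNT(*) AS cnt FROM t GROUP BY id, name
        UNION ALL
        SELECT NULL, name, COUNT(*) FROM t GROUP BY name
    )
    ORDER BY name, id IS NULL, id   -- subtotal row last within each name
""").fetchall()

for row in rows:
    print(row)
```

Each name gets its detail rows with a count of 1, followed by one row with a NULL id that carries the total for that name.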

Looks like you just want to count how many entries you have for each 'name', in which case you just need to do a simple COUNT query:
CREATE VIEW view_name AS
SELECT name, COUNT(id) AS counts
FROM table
GROUP BY name
The output in your case would be:
name counts
--------------
Tom 3
Leo 1

Related

PostgreSQL Count DISTINCT from one column when grouped by another

I have a single table that looks like the following (dumbed down):
userid | action | userstate
-----------------------------------------------------
1 | click | Maryland
2 | press | Delaware
3 | jog | New York
3 | leap | New York
What I'm trying to query is "number of users doing ANY action, per state"
So the result would be:
state | users_acting
---------------------
Maryland | 1
Delaware | 1
New York | 1
Note that each individual user belongs to only one state.
I can't get the combination of distinct users and grouping by state right. I can't use
SELECT DISTINCT (userid), COUNT(userid) FROM data GROUP BY state
because the distinct column would need to be in the GROUP BY, which I don't want, not to mention the problems with the select clause.
Thanks for any thoughts.
Just found out that there's a COUNT(DISTINCT ...) option, which doesn't require the distinct column to be placed in the GROUP BY clause.
SELECT userstate AS state, COUNT(DISTINCT userid) AS users_acting FROM data GROUP BY userstate
This does the trick.
You can try the format below:
SELECT userstate AS state, COUNT(DISTINCT userid) AS users_acting FROM data GROUP BY userstate
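A small check of COUNT(DISTINCT ...) against the sample data from the question, sketched with Python's sqlite3 module rather than PostgreSQL. The column names follow the table shown in the question; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (userid INTEGER, action TEXT, userstate TEXT);
    INSERT INTO data (userid, action, userstate) VALUES
        (1, 'click', 'Maryland'),
        (2, 'press', 'Delaware'),
        (3, 'jog',   'New York'),
        (3, 'leap',  'New York');
""")

# User 3 has two actions in New York but is counted once.
rows = conn.execute("""
    SELECT userstate AS state, COUNT(DISTINCT userid) AS users_acting
    FROM data
    GROUP BY userstate
    ORDER BY userstate
""").fetchall()
print(rows)  # [('Delaware', 1), ('Maryland', 1), ('New York', 1)]
```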

SQL query to get latest user to update record

I have a postgres database that contains an audit log table which holds a historical log of updates to documents. It contains which document was updated, which field was updated, which user made the change, and when the change was made. Some sample data looks like this:
doc_id | user_id | created_date | field | old_value | new_value
--------+---------+------------------------+-------------+---------------+------------
A | 1 | 2018-07-30 15:43:44-05 | Title | | War and Piece
A | 2 | 2018-07-30 15:45:13-05 | Title | War and Piece | War and Peas
A | 1 | 2018-07-30 16:05:59-05 | Title | War and Peas | War and Peace
B | 1 | 2018-07-30 15:43:44-05 | Description | test 1 | test 2
B | 2 | 2018-07-30 17:45:44-05 | Description | test 2 | test 3
You can see that the Title of document A was changed three times, first by user 1 then by user 2, then again by user 1.
Basically I need to know which user was the last one to update a field on a particular document. So for example, I need to know that User 1 was the last user to update the Title field on document A. I don't really care what time it happened, just the document, field, and user.
So sample output would be something like this:
doc_id | field | user_id
--------+-------------+---------
A | Title | 1
B | Description | 2
It seems like this should be a fairly straightforward query to write, but I'm having some trouble with it. I would think that GROUP BY is in order, but the problem is that if I group by doc_id I lose the user data:
select doc_id, max(created_date)
from document_history
group by doc_id;
doc_id | max
--------+------------------------
B | 2018-07-30 15:00:00-05
A | 2018-07-30 16:00:00-05
I could join this result back to the document_history table, but I would need to do so based on the doc_id and timestamp, which doesn't seem quite right. If two people edit a document at the exact same time, I would get multiple rows back for that document and field. Maybe that's so unlikely that I shouldn't worry about it, but still...
Any thoughts on a way to do this in a single query?
You want to filter the records, so think where, not group by:
select dh.*
from document_history dh
where dh.created_date = (select max(dh2.created_date) from document_history dh2 where dh2.doc_id = dh.doc_id);
In many databases, this will perform better than a group by, provided you have an index on document_history(doc_id, created_date).
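The correlated-subquery filter can be checked against the sample rows from the question. This is a minimal sketch with Python's sqlite3 module, not PostgreSQL; the table is trimmed to the columns the query needs, and timestamps are stored as text without the UTC offset, which is an assumption made so that MAX compares them correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_history (doc_id TEXT, user_id INTEGER, created_date TEXT, field TEXT);
    INSERT INTO document_history (doc_id, user_id, created_date, field) VALUES
        ('A', 1, '2018-07-30 15:43:44', 'Title'),
        ('A', 2, '2018-07-30 15:45:13', 'Title'),
        ('A', 1, '2018-07-30 16:05:59', 'Title'),
        ('B', 1, '2018-07-30 15:43:44', 'Description'),
        ('B', 2, '2018-07-30 17:45:44', 'Description');
    -- Index suggested in the answer.
    CREATE INDEX document_history_doc_date ON document_history (doc_id, created_date);
""")

# Keep only the rows whose timestamp is the latest one for their document.
latest = conn.execute("""
    SELECT dh.doc_id, dh.field, dh.user_id
    FROM document_history dh
    WHERE dh.created_date = (SELECT MAX(dh2.created_date)
                             FROM document_history dh2
                             WHERE dh2.doc_id = dh.doc_id)
    ORDER BY dh.doc_id
""").fetchall()
print(latest)  # [('A', 'Title', 1), ('B', 'Description', 2)]
```

If a document can have several fields edited, adding dh2.field = dh.field to the subquery would give one row per document and field; that variation is not part of the answer above.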
If your DBMS supports window functions (e.g. PostgreSQL, SQL Server; aka analytic function in Oracle) you could do something like this (SQLFiddle with Postgres, other systems might differ slightly in the syntax):
http://sqlfiddle.com/#!17/981af/4
SELECT DISTINCT
doc_id, field,
first_value(user_id) OVER (PARTITION BY doc_id, field ORDER BY created_date DESC) as last_user
FROM document_history
first_value(...) OVER (PARTITION BY ... ORDER BY x DESC) sorts the rows within each partition in descending order and then takes the first value, which here is the user_id of the row with the latest timestamp.
I added the DISTINCT to get your expected result. The window function just adds a new column to your SELECT result, and every row within the same partition gets the same value. If you do not need the DISTINCT, remove it; you can then work with the original data plus the newly added column.
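Here is a rough sketch of the first_value approach using Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since older versions have no window functions. The extra row for a Description edit on document A is hypothetical and is there only to show that the partition is per document and field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_history (doc_id TEXT, user_id INTEGER, created_date TEXT, field TEXT);
    INSERT INTO document_history (doc_id, user_id, created_date, field) VALUES
        ('A', 1, '2018-07-30 15:43:44', 'Title'),
        ('A', 2, '2018-07-30 15:45:13', 'Title'),
        ('A', 1, '2018-07-30 16:05:59', 'Title'),
        ('A', 3, '2018-07-30 16:30:00', 'Description'),  -- hypothetical extra row
        ('B', 1, '2018-07-30 15:43:44', 'Description'),
        ('B', 2, '2018-07-30 17:45:44', 'Description');
""")

# Every row in a (doc_id, field) partition gets the user_id of the newest row;
# DISTINCT then collapses each partition to a single result row.
last_users = conn.execute("""
    SELECT DISTINCT
        doc_id, field,
        first_value(user_id) OVER (PARTITION BY doc_id, field
                                   ORDER BY created_date DESC) AS last_user
    FROM document_history
    ORDER BY doc_id, field
""").fetchall()

for row in last_users:
    print(row)
```

User 3 made the most recent edit overall on document A, yet the Title row still reports user 1, because Title and Description are separate partitions.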

Searching a "vertical" table in SQLite

Tables are usually laid out in a "horizontal" fashion:
+-------+-----------+----------+
| recID | FirstName | LastName |
+-------+-----------+----------+
|   1   | Jim       | Jones    |
+-------+-----------+----------+
|   2   | Adam      | Smith    |
+-------+-----------+----------+
Here, however, is a table with the same data in a "vertical" layout:
+-------+-------+-----------+-------+
| rowID | recID | Property  | Value |
+-------+-------+-----------+-------+
|   1   |   1   | FirstName | Jim   | \
+-------+-------+-----------+-------+  These two rows constitute a single logical record
|   2   |   1   | LastName  | Jones | /
+-------+-------+-----------+-------+
|   3   |   2   | FirstName | Adam  | \
+-------+-------+-----------+-------+  These two rows are another single logical record
|   4   |   2   | LastName  | Smith | /
+-------+-------+-----------+-------+
Question: In SQLite, how can I search the vertical table efficiently and in such a way that recIDs are not duplicated in the result set? That is, if multiple matches are found with the same recID, only one (any one) is returned?
Example (incorrect):
SELECT rowID FROM items WHERE "Value" LIKE 'J%'
returns, of course, two rows with the same recID:
1 (Jim)
2 (Jones)
What is the optimal solution here? I can imagine storing intermediate results in a temp table, but hoping for a more efficient way.
(I need to search through all properties, so the SELECT cannot be restricted with e.g. "Property" = "FirstName". The database is maintained by a third-party product; I suppose the design makes sense because the number of property fields is variable.)
To avoid duplicate rows in the result returned by a SELECT, use DISTINCT:
SELECT DISTINCT recID
FROM items
WHERE "Value" LIKE 'J%'
However, this works only for the values that are actually returned, and only for entire result rows.
In the general case, to return one result record for each group of table records, use GROUP BY to create such groups.
For any column that does not appear in the GROUP BY clause, you then have to choose which value in the group to return, typically with an aggregate function; here we use MIN to pick the lowest rowID:
SELECT MIN(rowID)
FROM items
WHERE "Value" LIKE 'J%'
GROUP BY recID
To make this query more efficient, create an index on the recID column.
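The three queries can be compared side by side on the sample data. This is a minimal sketch using Python's sqlite3 module; the index name items_recID is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (rowID INTEGER PRIMARY KEY, recID INTEGER, Property TEXT, Value TEXT);
    INSERT INTO items (rowID, recID, Property, Value) VALUES
        (1, 1, 'FirstName', 'Jim'),
        (2, 1, 'LastName',  'Jones'),
        (3, 2, 'FirstName', 'Adam'),
        (4, 2, 'LastName',  'Smith');
    -- Hypothetical index name; supports the GROUP BY on recID.
    CREATE INDEX items_recID ON items (recID);
""")

# The asker's query: both matching rows of logical record 1 come back.
naive = conn.execute(
    "SELECT rowID FROM items WHERE Value LIKE 'J%' ORDER BY rowID"
).fetchall()

# DISTINCT collapses duplicate recIDs in the result.
distinct_recs = conn.execute(
    "SELECT DISTINCT recID FROM items WHERE Value LIKE 'J%'"
).fetchall()

# GROUP BY returns one rowID (the smallest) per logical record.
one_row_per_rec = conn.execute(
    "SELECT MIN(rowID) FROM items WHERE Value LIKE 'J%' GROUP BY recID"
).fetchall()

print(naive)            # [(1,), (2,)]
print(distinct_recs)    # [(1,)]
print(one_row_per_rec)  # [(1,)]
```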

SQL find name doubles and sum values

I have one table, records, with the columns:
|rec.id|rec.name|user.name|hours|
and the values respectively:
|1 |google |Admin | 12 |
|2 |yahoo |Admin | 1 |
|3 |bing |Manager | 4 |
What I want to do is take all of the records with the same user.name and sum their hours together in SQL. Perhaps it's the early morning, but I can't seem to figure out a way of doing this. I thought about using SQL to find the duplicates, but that's only going to return a number and won't let me do what I want with them. This sounds like a really simple thing, so sorry in advance.
select user_name,
sum(hours)
from your_table
group by user_name;
You would group on the user name and use the sum aggregate on the hours:
select [user.name], sum(hours) as hours
from TheTable
group by [user.name]
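Both answers boil down to the same GROUP BY with SUM. A quick sketch with Python's sqlite3 module follows; the underscored column names (rec_id, rec_name, user_name) are an assumption that avoids having to quote the dotted names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (rec_id INTEGER PRIMARY KEY, rec_name TEXT, user_name TEXT, hours INTEGER);
    INSERT INTO records (rec_id, rec_name, user_name, hours) VALUES
        (1, 'google', 'Admin',   12),
        (2, 'yahoo',  'Admin',    1),
        (3, 'bing',   'Manager',  4);
""")

# One result row per user, with that user's hours added up.
totals = conn.execute("""
    SELECT user_name, SUM(hours) AS hours
    FROM records
    GROUP BY user_name
    ORDER BY user_name
""").fetchall()
print(totals)  # [('Admin', 13), ('Manager', 4)]
```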

How to get row count in all rows?

select id from table;
+------+
| id |
+------+
| 774 |
| 2775 |
+------+
returns 2 rows
select count(id) as count, id from table;
+-------+-----+
| count | id |
+-------+-----+
| 2 | 774 |
+-------+-----+
but this returns only 1 row.
How can I return all rows, with the total count in each record?
SQL ???
+-------+------+
| count | id |
+-------+------+
| 2 | 774 |
| 2 | 2775 |
+-------+------+
SELECT id, (select count(*) from table) AS TotalRows
FROM table;
That said, this seems unnecessary, as the total count is the same for every row.
Use a group by (note that this counts the rows per id, not the total number of rows):
select id, count(id)
from table
group by id;
(BTW, the SQL in your question does not work, at least in Oracle and, AFAIK, in MySQL.)
I'm not sure what you're trying to do, but if you're trying to fetch the rows and get the total count in the same query because it's resource-intensive and you don't want to repeat your joins/conditions/whatever in two queries, under MySQL you can do:
# Returns a regular results set
SELECT SQL_CALC_FOUND_ROWS foo, bar FROM baz WHERE qux = 'corge' LIMIT 2;
# Returns the total count of found rows (without the LIMIT)
SELECT FOUND_ROWS();
If you want the total number of rows after the LIMIT, or don't have a LIMIT at all, you can skip the SQL_CALC_FOUND_ROWS.
However, generally speaking, counting the total number of rows doesn't scale very well. If you can, find an alternative that doesn't require it. For example, if it's for paging, consider showing only 'next' / 'prev' buttons without displaying the total number of pages. If you have 30 rows per page, you can use LIMIT 31 instead of 30, display only the first 30 rows, and check whether the 31st row exists to know if a 'next' button should be displayed.
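The "fetch one extra row" paging idea can be sketched as follows, using Python's sqlite3 module instead of MySQL. The entries table and the fetch_page helper are hypothetical names invented for this example.

```python
import sqlite3


def fetch_page(conn, page, page_size=30):
    """Return (rows, has_next) for a zero-based page without counting all rows."""
    rows = conn.execute(
        "SELECT id FROM entries ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, page * page_size),  # ask for one row more than is shown
    ).fetchall()
    # The extra row is never displayed; it only signals that a next page exists.
    return rows[:page_size], len(rows) > page_size


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO entries (id) VALUES (?)", [(i,) for i in range(1, 66)])

first_rows, first_has_next = fetch_page(conn, 0)
last_rows, last_has_next = fetch_page(conn, 2)
print(len(first_rows), first_has_next)  # 30 True
print(len(last_rows), last_has_next)    # 5 False
```

With 65 rows and 30 per page, the first page reports that a next page exists, while the third page returns the remaining 5 rows and no 'next' button is needed.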
If you are using an Oracle database, you can also use the COUNT analytic function to achieve this, as follows:
SELECT COUNT(*) OVER (PARTITION BY 1) AS COUNT, ID FROM TABLE
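The analytic count is not specific to Oracle; any engine with window functions can express it. Below is a minimal sketch with Python's sqlite3 module (assuming SQLite 3.25 or newer), which also runs the scalar-subquery variant from the earlier answer for comparison. The table name ids is hypothetical, since table itself is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ids (id INTEGER PRIMARY KEY);
    INSERT INTO ids (id) VALUES (774), (2775);
""")

# Window version: an empty OVER () makes the whole result set one partition.
windowed = conn.execute(
    "SELECT COUNT(*) OVER () AS total, id FROM ids ORDER BY id"
).fetchall()

# Scalar-subquery version from the earlier answer.
subquery = conn.execute(
    "SELECT (SELECT COUNT(*) FROM ids) AS total, id FROM ids ORDER BY id"
).fetchall()

print(windowed)  # [(2, 774), (2, 2775)]
print(subquery)  # [(2, 774), (2, 2775)]
```

Both forms return every row with the total attached, which is the output the question asks for.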