Google BigQuery create simple count table - sql

I want to use Google BigQuery to store the number of searches for certain keywords on my site. I created a table structure like this:
| date | keyword | number_of_searches |
| 2017-03-29 | pizza | 1 |
I want to increment the number_of_searches value if the combination of date and keyword already exists.

So you want a way to store the number of searches for a given keyword.
With BigQuery you need to change your approach.
Let's review the traditional steps:
- use a SELECT to find out if there is a row for today
- if not, then INSERT one with a default value
- if it exists, use an UPDATE statement to increment the counter
BigQuery is built for append-only workloads and is poorly suited to frequent UPDATE statements, so you need to simplify the collection and change how you analyse the data. Instead of the three steps above, you do one:
- insert a new row for each search
This way you end up with multiple rows per keyword, and you aggregate them at query time to find out how many searches each keyword had. The query would be something like this:
SELECT
  myday AS date,
  keyword,
  COUNT(1) AS number_of_searches
FROM table
GROUP BY 1, 2
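As a rough sketch of the append-then-aggregate pattern, here is a small Python example. It uses an in-memory SQLite database as a stand-in for BigQuery (an assumption made only so the example runs anywhere), and the table name searches and the helper log_search are hypothetical. The point is just that writes are plain inserts and the counts come from GROUP BY at read time.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; only the pattern carries over.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searches (myday TEXT, keyword TEXT)")


def log_search(day, keyword):
    """Append one row per search; no SELECT or UPDATE is needed on write."""
    conn.execute(
        "INSERT INTO searches (myday, keyword) VALUES (?, ?)", (day, keyword)
    )


# Hypothetical sample traffic.
for day, keyword in [
    ("2017-03-29", "pizza"),
    ("2017-03-29", "pizza"),
    ("2017-03-29", "pasta"),
    ("2017-03-30", "pizza"),
]:
    log_search(day, keyword)

# The counter is computed when reading, not maintained when writing.
counts = conn.execute(
    "SELECT myday AS date, keyword, COUNT(1) AS number_of_searches "
    "FROM searches GROUP BY myday, keyword ORDER BY myday, keyword"
).fetchall()
print(counts)
```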

Related

BigQuery Create Table Query from Google Sheet with Variable item string field into Repeated Field

I hope I explain this adequately.
I have a series of Google Sheets with data from an Airtable database. Several of the fields are stringified arrays with recordIds to another table.
These fields can have between 0 and n - comma separated values.
I run a create/overwrite table SELECT statement to create native BigQuery tables for reporting. This works great.
Now I need to add the recordIds to a Repeated field.
I've manually written to a repeated field using:
INSERT INTO `robotic-vista-339622.Insurly_dataset.zzPOLICYTEST` (policyID, locations, carrier)
VALUES ('12334556',[STRUCT('recordId1'),STRUCT('recordId2')], 'name of policy');
However, I need to know how to do this using a SELECT statement rather than an INSERT. I also need to know how to do this when the number of recordIds retrieved from Airtable is not known in advance. One record could have none and another record could have 10 or more.
Any given sheet will look like the following, where "locations" contains the recordIds I want to add to a repeated field.
SHEETNAME: POLICIES
|policyId |carrier | locations |
|-----------|-----------|---------------------------------|
|recrTkk |Workman's | |
|rec45Yui |Workman's |recL45x32,recQz70,recPrjE3x |
|recQb17y |ABC Co. |rec5yUlt,recIrW34 |
In the above, the first row/record has no location IDs; the following rows/records have three and two, respectively.
Any help is appreciated.
Thanks.
I'm unsure if answering my own question is the correct way to show that it was solved... but here is what it took.
I created a native table in BigQuery. The locations field is a STRING with mode REPEATED.
Then I just run an overwrite table SELECT statement.
SELECT recordId,Name, Amount, SPLIT(locations) as locations FROM `projectid.datasetid.googlesheetsdatatable`;
Tested, and I can run linked queries on the locations with UNNEST.

Postgresql query using multiple WHERE conditions

I am wondering if there is a simple / smart way to pass a query to a Postgresql database. I have a database whose headers look something like this:
measurementPointID | parameterA | parameterB | measurement | measurementTIME
There are some dozens of records within the database.
I would like to pass a query that retrieves data only for a set of measurementPointIDs. There are tens of thousands of measurementPointID values that I need to retrieve, and I have all of them available in, for example, a CSV file.
The query should do a GROUP BY measurementTIME and ORDER BY measurementTIME as well. One detail is that if the measurement is zero (measurement = 0) there is no row corresponding to the measurementPointID at all.
Am I trying to do something too complicated or in a stupid way?

Changing only one value of a column wherein there are multiple data of the same value in SQL

For example, I have a table:
User ID(int) | Card ID(int) | Deck(int)
1841 | 14 | 1
1841 | 14 | 1
It is defined that the int values in the Deck column are always 1 or 2 (1 indicating that the card is in the deck), and Card ID is not unique for a user (this indicates that the user has two copies of card 14), as shown in the example above. What if I want to remove one card 14 from the deck while the other remains? What is the proper SQL command? I tried UPDATE but it
You can add a LIMIT at the end of the UPDATE query (MySQL supports this):
update [table name] set Deck=2 where User_ID=1841 and Card_id=14 limit 1;
Basically you're missing a way of referencing any single particular row. Depending on how critical such a reference is to the application, it is almost always a bad idea to allow this situation. There are many solutions for this, for example:
1) Many database engines give every row a unique OID or ROWID field, which is not displayed with "SELECT * FROM TABLE" but can be used if requested explicitly. Depending on which database engine you use, e.g. with PostgreSQL (on versions and tables that have OIDs), try
SELECT OID, * FROM TABLE WHERE OID = 'somevalue'
this is usually used if you don't want to enforce UNIQUE on the table, but rather deal with possible mistaken input later if it will unfortunately appear.
2) You can add ID column, for example autoincremental ( refer to DB manual ), and then update it to contain unique IDs
ALTER TABLE table_name ADD column_name column-definition;
3) You can use a self-incrementing "running total", e.g. with MySQL it looks more or less like this:
SET @runtot := 0;
SELECT * FROM (SELECT t.*, (@runtot := @runtot + 1) AS rt FROM table_name t) AS numbered WHERE rt = 'somevalue';
(this recomputes the numbering on every query, so it will probably be inefficient)
4) You can use LIMIT as explained in previous answer
5) You can JOIN some another table with unique IDs and possibly update resulting relation, or combine some query to create and use static VIEW
6) You can use SELECT with some dynamically allocated value, for example RAND() or NOW() or similar. It probably won't create unique identifiers across whole table, depending what function you'll use and how you will use it
7) combine two or more above solutions altogether
..and probably many other solutions. Anyway usually there's some "Id" column used with some UNIQUE constraint.
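To make option 1 concrete, here is a minimal sketch in Python. It assumes SQLite, where every ordinary table has an implicit rowid; the table name user_cards is hypothetical. Other engines expose a different row identifier or none at all, so treat this as an illustration of the idea rather than portable SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_cards (user_id INTEGER, card_id INTEGER, deck INTEGER)"
)
# Two identical rows: the user owns two copies of card 14, both in the deck.
conn.executemany(
    "INSERT INTO user_cards VALUES (?, ?, ?)", [(1841, 14, 1), (1841, 14, 1)]
)

# Pick exactly one of the identical rows by its implicit rowid
# and update only that row.
conn.execute(
    """
    UPDATE user_cards
    SET deck = 2
    WHERE rowid = (
        SELECT rowid FROM user_cards
        WHERE user_id = ? AND card_id = ? AND deck = 1
        LIMIT 1
    )
    """,
    (1841, 14),
)

decks = sorted(row[0] for row in conn.execute("SELECT deck FROM user_cards"))
print(decks)
```

Only one of the two copies ends up with deck = 2; the other stays at 1.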

Query - If more than one record ID# (non primary key) matches, use later date or larger primary key

I built an MS Access database that takes survey data to create a custom report. The survey application that was used does not give us the reports we need. I usually grab the data (Excel), import it into Access, and build the reports the way we need them.
For the first time, we have people redoing the survey because they are updating something or they forgot to add something. I need to be able to grab the most recent survey's data so we don't get a duplicate when we run the report. (My main report is composed of several subreports. Some subreports will not be visible if null, and any questions not answered are hidden and shrunk to prevent bulky reports with unnecessary whitespace.)
record ID (PK) | FName | LName | IDNum | Completed
1 | Bob | Smith | 57 | 3/31/2013 5:00pm
2 | Bob | Smith | 57 | 3/31/2013 7:00pm
I want record ID 2 or the one that was completed at 7pm.
The queries and reports are already completed, so I have been trying to find a way to add a line of code in the criteria line of my query to grab the most recent record if the IDNum matches more than one record.
I have been trying to find the best way to do it for the past several hours. I don't think modifying my table into a 'table without duplicates' is an option, because after the database is complete, someone less technical will be using it. All they are going to do is import a new Excel file to overwrite the table, and the queries do everything to build the report. I don't want to manually delete the duplicate records either.
I know I need to do something along the lines with
IIF(count(IDNum)>1, *something, *something)
*I get stuck on the true and false parts. How do I tell Access that it needs to check within the table again to find the record with the larger primary key?
I thought this was going to be easy, but I guess I was wrong. lol
I am fairly new to MS Access, so I know I am not using its full potential and I might be coming at this from the wrong angle. Any advice would be greatly appreciated.
I'm a student going into Info Systems, so I would really like to learn how to do this.
I believe the query you are looking for is
SELECT t1.*
FROM YourTable t1 INNER JOIN
(SELECT IDNum, MAX(Completed) AS MaxOfCompleted
FROM YourTable GROUP BY IDNum
) t2
ON t1.IDNum = t2.IDNum AND t1.Completed = t2.MaxOfCompleted;
When you are using an if function it should be iif not iff.
I'd recommend a correlated subquery, such as the following:
SELECT
Data.RecordID
, Data.FName
, Data.LName
, Data.IDNum
, Data.Completed
FROM
Data
WHERE
Data.Completed IN
(
SELECT TOP 1
DataSQ.Completed
FROM
Data as DataSQ
WHERE
DataSQ.IDNum = Data.IDNum
GROUP BY
DataSQ.Completed
ORDER BY
DataSQ.Completed DESC
)
GROUP BY
Data.RecordID
, Data.FName
, Data.LName
, Data.IDNum
, Data.Completed
;
Explanation
Instead of using a function such as Max or IIF, you can embed another SELECT query within the WHERE clause of your main query. The nested query is used to determine the most recent Completed date for every IDNum. Unlike selecting the most recent survey directly from your table with SELECT TOP 1 + ORDER BY, which would only return one record, the WHERE clause in your nested query refers back to the main query and produces a result for each IDNum. This is known as the Top N per Group pattern, and I've found it to be very useful. Note that in the nested query you will need to use a table name alias so that Access will be able to differentiate between the two queries.
Also, I'd generally recommend against trying to use a table PK to perform sorts. There are many cases when the PK order value will not be a good indicator of the values of related fields.
This code worked when tested on dummy data - best of luck!
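For anyone who wants to try the "latest record per IDNum" idea outside Access, here is a small sketch in Python with SQLite. It uses the join against MAX(Completed) from the first answer. The table name surveys, the ISO-formatted timestamps, and the third row (Ann Jones) are assumptions added so the example has more than one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE surveys ("
    "record_id INTEGER PRIMARY KEY, fname TEXT, lname TEXT, "
    "id_num INTEGER, completed TEXT)"
)
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Bob", "Smith", 57, "2013-03-31 17:00"),
        (2, "Bob", "Smith", 57, "2013-03-31 19:00"),
        (3, "Ann", "Jones", 58, "2013-03-30 09:00"),  # hypothetical extra row
    ],
)

# Join each row to the newest completion time of its id_num group,
# so only the most recent survey per person survives.
latest = conn.execute(
    """
    SELECT t1.record_id, t1.id_num, t1.completed
    FROM surveys AS t1
    INNER JOIN (
        SELECT id_num, MAX(completed) AS max_completed
        FROM surveys
        GROUP BY id_num
    ) AS t2
      ON t1.id_num = t2.id_num AND t1.completed = t2.max_completed
    ORDER BY t1.id_num
    """
).fetchall()
print(latest)
```

The timestamps are stored as ISO strings here so that MAX compares them chronologically; in Access a real Date/Time field does the same job.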

How can I reorder rows in sql database

Is it possible to reorder rows in SQL database?
For example; how can I swap the order of 2nd row and 3rd row's values?
The order of the rows is important to me since I need to display the values according to that order.
Thanks for all the answers. But 'Order by' won't work for me.
For example, I put a list of bookmarks in the database.
I want to display them based on the result I get from the query (not in alphabetical order), just in the order they were inserted.
But the user may rearrange the position of a bookmark (in any way he/she wants). So I can't use 'order by'.
An example is how bookmarks are displayed in Firefox. The user can switch positions easily. How can I represent that in the DB?
Thank you.
It sounds like you need another column like "ListOrder". So your table might look like:
BookMark  ListOrder
========  =========
d         1
g         2
b         3
f         4
a         5
Then you can "order by" ListOrder.
Select * from MyTable Order By ListOrder
If the user can only move a bookmark one place at a time, you can use integers as the ListOrder, and swap them. For example, if the user wants to move "f" up one row:
Update MyTable
Set ListOrder=ListOrder+1
Where ListOrder=(Select ListOrder-1 From MyTable where BookMark='f')
Update MyTable
Set ListOrder=ListOrder-1
Where BookMark='f'
If the user can move a bookmark up or down many rows at once, then you need to reorder a segment. For example, if the user wants to move "f" to the top of the list, you need to:
if (increment) {
update MyTable
Set ListOrder=ListOrder-1
where ListOrder<=1 -- The New position
and ListOrder >(Select ListOrder from MyTable where BookMark='f')
} else {
update MyTable
Set ListOrder=ListOrder+1
where ListOrder>=1 -- The New position
and ListOrder <(Select ListOrder from MyTable where BookMark='f')
}
update MyTable
Set ListOrder=1 -- The New Position
Where Bookmark='f'
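The segment shift above can be sketched end to end in Python. This assumes SQLite and wraps the statements in a hypothetical helper, move_bookmark; the helper current_order is also made up for the demo. It follows the same idea: shift the rows between the old and new positions by one, then place the moved row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (BookMark TEXT PRIMARY KEY, ListOrder INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("d", 1), ("g", 2), ("b", 3), ("f", 4), ("a", 5)],
)


def move_bookmark(conn, bookmark, new_pos):
    """Move one bookmark to new_pos, shifting the rows in between by one."""
    (old_pos,) = conn.execute(
        "SELECT ListOrder FROM MyTable WHERE BookMark = ?", (bookmark,)
    ).fetchone()
    with conn:  # run the shift and the final placement in one transaction
        if new_pos > old_pos:
            # Moving down the list: rows in (old_pos, new_pos] move up by one.
            conn.execute(
                "UPDATE MyTable SET ListOrder = ListOrder - 1 "
                "WHERE ListOrder > ? AND ListOrder <= ?",
                (old_pos, new_pos),
            )
        elif new_pos < old_pos:
            # Moving up the list: rows in [new_pos, old_pos) move down by one.
            conn.execute(
                "UPDATE MyTable SET ListOrder = ListOrder + 1 "
                "WHERE ListOrder >= ? AND ListOrder < ?",
                (new_pos, old_pos),
            )
        conn.execute(
            "UPDATE MyTable SET ListOrder = ? WHERE BookMark = ?",
            (new_pos, bookmark),
        )


def current_order(conn):
    """Return the bookmarks in display order."""
    rows = conn.execute("SELECT BookMark FROM MyTable ORDER BY ListOrder")
    return [row[0] for row in rows]


move_bookmark(conn, "f", 1)
print(current_order(conn))
```

Reading the old position first and passing it as a parameter also sidesteps engines that refuse a subquery on the table being updated.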
As others have mentioned, it's not a good idea to depend on the physical order of the database table. Relational tables are conceptually more like unordered sets than ordered lists. Assuming a certain physical order may lead to unpredictable results.
Sounds like what you need is a separate column that stores the user's preferred sort order. But you'll still need to do something in your query to display the results in that order.
It is possible to specify the physical order of records in a database by creating a clustered index, but that is not something you'd want to do on an arbitrary user-specified basis. And it may still lead to unexpected results.
Use ORDER BY in your SELECT query. For example, to order by a user's last name, use:
SELECT * FROM User ORDER BY LastName
The order of the rows on the actual database should not matter.
You should use the ORDER BY clause in your queries to order them as you need.
Databases can store the data in any way they want. Using the "order by" clause is the only way to guarantee an ordering of the data. In your bookmark example, you could have an integer field that indicates the ordering, and then update that field as a user moves things around. Then ORDER BY that column to get things in the right order.
A little late to the party, but anyone still looking for an answer to this problem, you need to use the Stern-Brocot technique.
Here's an article explaining the theory behind it
For each item you need to store a numerator and a denominator. Then you can also add a computed column which is the numerator divided by the denominator. Each time you move an item in between two others, the item's numerator becomes the sum of both neighboring numerators, and the item's denominator becomes the sum of both neighboring denominators.
These numbers won't skyrocket as fast as with the "averaging" method, where you lose all accuracy after 17 swaps.
I also created a demo where the method is implemented.
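A tiny Python sketch of the numerator/denominator rule described above (this is not the linked demo). Positions are (numerator, denominator) pairs, and the sentinels 0/1 and 1/0 for "before everything" and "after everything" are an assumption of this sketch; the function names mediant and sort_key are hypothetical.

```python
def mediant(left, right):
    """New position between two neighbours: sum numerators, sum denominators."""
    return (left[0] + right[0], left[1] + right[1])


def sort_key(position):
    """The computed column: numerator divided by denominator."""
    numerator, denominator = position
    return numerator / denominator


# Assumed sentinels: 0/1 sits before every item, 1/0 after every item.
# The upper sentinel is only ever a neighbour and is never sorted itself.
LOW, HIGH = (0, 1), (1, 0)

first = mediant(LOW, HIGH)          # first item in an empty list
second = mediant(first, HIGH)       # appended after `first`
between = mediant(first, second)    # inserted between the two

print(first, second, between)
print(sorted([first, second, between], key=sort_key))
```

Because the mediant of two fractions always lies strictly between them, the new item sorts between its neighbours without renumbering any other row.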
I have a solution for this that I have used a few times. I keep an extra field "sort_order" in the table, and update this when reordering. I've used this in cases where I have some sort of containers with items, and the order of the items should be editable inside the container. When reordering, I only update the sort_order for the items in the current container, which means not too many (in practice usually only a few) rows have to be updated.
In short, I do the following:
- add a sort_order field to the items table
- when inserting a new row, I set sort_order=id
- when reordering (needs id of item to move, and id of item to insert after):
  - select id, sort_order from items where container = ID order by sort_order
  - split the id and sort_order from rows in two arrays
  - remove the id of the item to move from the id-list
  - insert the id of the item to move after the id of the item to insert after
  - merge the list of ids and the list of sort_order into a two dimensional array, as [[id, sort_order], [id2, sort_order], ...]
  - run update item set sort_order=SORT_ORDER where id=ID (executemany) with merged list
(If moving item to another container, after updating "container foreign key" move first or last depending on app.)
(If the update involves a large number of items, I do not think this solution is a good approach.)
I have made an example using python and mysql on http://wannapy.blogspot.com/2010/11/reorder-rows-in-sql-database.html (copy and try it) along with some extra explanations.
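The steps listed in this answer can be sketched roughly as follows. This is not the code from the linked post; it assumes SQLite instead of MySQL, and the helper name move_after and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, container INTEGER, sort_order INTEGER)"
)
# Hypothetical container 7 with four items; sort_order starts out equal to id.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 7, 1), (2, 7, 2), (3, 7, 3), (4, 7, 4)],
)


def move_after(conn, container, move_id, after_id):
    """Re-deal the container's existing sort_order values over the new id order."""
    rows = conn.execute(
        "SELECT id, sort_order FROM items WHERE container = ? ORDER BY sort_order",
        (container,),
    ).fetchall()
    ids = [row[0] for row in rows]
    orders = [row[1] for row in rows]  # stays in ascending order
    ids.remove(move_id)
    ids.insert(ids.index(after_id) + 1, move_id)
    with conn:
        # Pair each sort_order value with the id that now holds that slot.
        conn.executemany(
            "UPDATE items SET sort_order = ? WHERE id = ?", list(zip(orders, ids))
        )


move_after(conn, 7, move_id=1, after_id=3)
ordered_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM items WHERE container = 7 ORDER BY sort_order"
    )
]
print(ordered_ids)
```

Only rows in the affected container are touched, which matches the point above about keeping the number of updated rows small.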
I guess a simple order by would be what you're looking for?
select my_column from my_table order by my_order_column;
As others have stated use an order by.
Never depend on the order in which data exists in a physical table; always base the order on the data you are working with, be it one or more key fields.
First, let me agree with everyone here that the order in the table shouldn't matter. Use a separate [SortOrder] column that you update and include an Order By clause.
That said, SQL Server databases do allow for a single "clustered index" on a table that will actually force the position in the underlying table storage. Primarily useful if you have a big dataset and always query by something specific.
Add a position column to your table and store as a simple integer.
If you need to support multiple users or lists, your best bet is to create a bookmarks table, a users table and a table to link them.
bookmarks: id,url
users: id,name
users_bookmarks: user_id, bookmark_id, position, date_created
Assuming date_created is populated when inserting rows you can get secondary list ordering based on date.
select bookmark_id from users_bookmarks where user_id = 1 order by position, date_created;
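Here is a minimal runnable sketch of that three-table layout, assuming SQLite. The column types, the sample user, the URLs and the dates are all made up for illustration; only the table and column names come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users_bookmarks (
        user_id      INTEGER REFERENCES users(id),
        bookmark_id  INTEGER REFERENCES bookmarks(id),
        position     INTEGER,
        date_created TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- Hypothetical sample data.
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO bookmarks VALUES
        (10, 'https://example.com/a'),
        (11, 'https://example.com/b'),
        (12, 'https://example.com/c');
    INSERT INTO users_bookmarks (user_id, bookmark_id, position, date_created) VALUES
        (1, 10, 2, '2020-01-01 00:00:00'),
        (1, 11, 1, '2020-01-02 00:00:00'),
        (1, 12, 1, '2020-01-03 00:00:00');
    """
)

# position decides the order; date_created breaks ties between equal positions.
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT bookmark_id FROM users_bookmarks "
        "WHERE user_id = 1 ORDER BY position, date_created"
    )
]
print(ordered)
```

Bookmarks 11 and 12 share position 1, so the older one (11) comes first, followed by 12 and then 10.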
At times like this, I am reminded of a quote from the Matrix: "Do not try and order the database. That's impossible. Instead, only realize the truth... there is no order. Then you will see that it is not the table that orders itself, it is you who orders the table."
When working with MySQL through a GUI, there is always a decision to make. If you run something like SELECT * FROM users, MySQL has to return the rows in some order. Normally (though this is not guaranteed), that will be primary key order.
+----+--------+
| id | name   |
+----+--------+
| 1  | Brian  |
| 2  | Carl   |
| 3  | Albert |
+----+--------+
When you add an ORDER BY command to the query, it will make the decision to order by some other field.
For Example Select * From users ORDER BY name would yield:
+----+--------+
| id | name   |
+----+--------+
| 3  | Albert |
| 1  | Brian  |
| 2  | Carl   |
+----+--------+
So to your question, you appear to want to change the default order in which your table displays this information. In order to do that, check to see what your primary key field is. For most practical purposes, having a unique identifying natural number tends to do the trick. MySQL has an AUTO_INCREMENT attribute for this. When creating the table, it would look something like field_name int NOT NULL AUTO_INCREMENT.
All of this is to say: if you would like to change "the row order", you would need to update this value. However, since the identifier is something that other tables would use to reference your row, this seems a little bit reckless.
If you for example went: UPDATE table Set id = 1 where id = 2;, this would initially fail, since both rows would end up with an identical id and fail the primary key check (which insists on both uniqueness and having a value set). You could juggle this by running three update statements in a row:
UPDATE users Set id = 100000000 where id = 1;
UPDATE users Set id = 1 where id = 2;
UPDATE users Set id = 2 where id = 100000000;
This would result in the rows for this table looking like:
+----+--------+
| id | name   |
+----+--------+
| 1  | Carl   |
| 2  | Brian  |
| 3  | Albert |
+----+--------+
This would technically work to reorder the table, but only in a bubble. MySQL being a relational database means that any table which was depending on that data to be consistent will now point to the wrong data. For example, I have a table which stores birthdays, referencing the initial user table. Its structure might look like this:
+----+---------+------------+
| id | user_id | birthdate  |
+----+---------+------------+
| 1  | 1       | 1993-01-01 |
| 2  | 2       | 1980-02-03 |
| 3  | 3       | 1955-01-01 |
+----+---------+------------+
By switching the ID's on the user table, you MUST update the user_id value on the birthdays table. Of course MySQL comes prepared for this: enter "Foreign Key Constraints". As long as you configured all of your foreign key constraints to Cascade Updates, you wouldn't need to manually change the reference to every value you changed.
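To see the cascade in action, here is a small sketch in Python. It assumes SQLite rather than MySQL (SQLite needs PRAGMA foreign_keys = ON for constraints to be enforced), and it assumes the birthdays rows have distinct ids. It replays the three-step id swap from above and then checks that each birthday still belongs to the same person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE birthdays (
        id        INTEGER PRIMARY KEY,
        user_id   INTEGER REFERENCES users(id) ON UPDATE CASCADE,
        birthdate TEXT
    );
    INSERT INTO users VALUES (1, 'Brian'), (2, 'Carl'), (3, 'Albert');
    INSERT INTO birthdays VALUES
        (1, 1, '1993-01-01'),
        (2, 2, '1980-02-03'),
        (3, 3, '1955-01-01');

    -- The three-step swap; each change cascades into birthdays.user_id.
    UPDATE users SET id = 100000000 WHERE id = 1;
    UPDATE users SET id = 1 WHERE id = 2;
    UPDATE users SET id = 2 WHERE id = 100000000;
    """
)

pairs = conn.execute(
    "SELECT u.name, b.birthdate FROM users AS u "
    "JOIN birthdays AS b ON b.user_id = u.id ORDER BY u.id"
).fetchall()
print(pairs)
```

Brian and Carl have swapped ids, but each name is still joined to its original birthdate because the foreign key followed the change.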
These queries would all be a lot of manual work and potentially weaken your data's integrity. If you have fields you would like to rank and reorder regularly, the answer posed by Mike Lewis on this question with the "table order" would be a more sensible answer (and if that is the case, then his is the best solution and just disregard this answer).
In response to your post here, the answer you may be looking for is:
To order chronologically, add a DateAdded or similar column with a datetime or smalldatetime datatype.
On all methods that insert into the database, make sure you insert CURRENT_TIMESTAMP in the DateAdded column.
On methods that query the database, add ORDER BY DateAdded at the end of the query string.
NEVER rely on the physical position in the database system. It may work MOST of the time but definitely not ALL of the time.
The question lacks the detail that would let anyone give you a correct answer. Clearly you could read the records into memory and then update them. But this is bad on so many different levels.
The issue is like this. Depending on the schema that is actually implemented there is logic to the way that the records are physically written to disk. Sometimes they are written in order of insert and other times they are inserted with space between blocks (see extents).
So changing the physical order is not likely without swapping column data; and this has a deep effect on the various indices. You are left having to change the logical order.
As I read your update... I'm left to understand that you may have multiple users and each user is to have bookmarks that they want ordered. Looks like you need a second table that acts as an intersection between the user and the bookmark. Then all you need is an inner join and an order by.
But there is not enough information to offer a complete solution.
Here is a stored procedure script to increment or decrement (one at a time) in MySQL.
Note, MySQL doesn't allow you to select from the table you're updating in a subquery of the same statement, so the above answers don't work there.
I have also set it to return an error if there is no item above / below if you're incrementing / decrementing, respectively.
DELIMITER $$
CREATE PROCEDURE `spReorderSequenceItems` (
    IN _SequenceItemId INT,
    IN _SequenceId INT,
    IN IncrementUp TINYINT,
    OUT Error VARCHAR(255)
)
BEGIN
    DECLARE CurrentPosition INT;
    SELECT Position INTO CurrentPosition
    FROM tblSequenceItems
    WHERE SequenceItemId = _SequenceItemId;
    IF IncrementUp = 1 THEN
        IF (
            SELECT Position
            FROM tblSequenceItems
            WHERE Position = CurrentPosition + 1 AND SequenceId = _SequenceId
        ) THEN
            UPDATE tblSequenceItems
            SET Position = Position - 1
            WHERE Position = CurrentPosition + 1 AND SequenceId = _SequenceId;
            UPDATE tblSequenceItems
            SET Position = Position + 1
            WHERE SequenceItemId = _SequenceItemId;
        ELSE
            SELECT 'No Item Above' AS _Error INTO Error;
        END IF;
    ELSE
        IF (
            SELECT Position
            FROM tblSequenceItems
            WHERE Position = CurrentPosition - 1 AND SequenceId = _SequenceId
        ) THEN
            UPDATE tblSequenceItems
            SET Position = Position + 1
            WHERE Position = CurrentPosition - 1 AND SequenceId = _SequenceId;
            UPDATE tblSequenceItems
            SET Position = Position - 1
            WHERE SequenceItemId = _SequenceItemId;
        ELSE
            SELECT 'No Item Below' AS _Error INTO Error;
        END IF;
    END IF;
END
$$
DELIMITER ;
Call it with
CALL spReorderSequenceItems(1, 1, 1, @Error);
SELECT @Error;