Extracting Data from a Multi-Data Column in SQL


I'm creating a sales leaderboard in Holistics, and the "user_id" column is a multi-data column.
Here's a snapshot of the column "user_id":
I need to show the "name" part of the user. I tried CONVERT and even JSON_VALUE, but Holistics recognizes neither.
I also tried CAST, but user_id still comes out in numeric form.
Here's my code:
And here's the data output:
Can you help me show the actual name of the salesperson?
I'm a newbie here and it's my first post, which is why all my screenshots are posted as links.

To select a particular field from JSON data (and JSON is what you have in the user_id column), try this combination:
SELECT
    JSON_UNQUOTE(JSON_EXTRACT(user_id,'$.id')) as id,
    JSON_UNQUOTE(JSON_EXTRACT(user_id,'$.name')) as user_name
FROM public.deals
This should return the user's id and name from your JSON column.
Whatever software you use, it probably expects the data in row-column format, so you just need to adjust the SQL query until it returns properly formatted data. Since you have JSON in the user_id column (which seems odd, but never mind), a combination of JSON_EXTRACT, JSON_UNQUOTE and perhaps CAST should do the trick.
Bear in mind, though, that running DISTINCT on a big table with these functions could be slow.
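For anyone who wants to try the idea without access to the original database, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows are made up, since the real data is not shown in the post. Note that JSON function names differ between engines: JSON_EXTRACT and JSON_UNQUOTE are the MySQL spelling, SQLite uses json_extract (which already returns strings unquoted), and PostgreSQL uses the ->> operator.

```python
import sqlite3

# Hypothetical stand-in for public.deals; the real columns are not shown in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (deal_id INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO deals VALUES (?, ?)",
    [
        (1, '{"id": 101, "name": "Alice"}'),
        (2, '{"id": 102, "name": "Bob"}'),
    ],
)

# json_extract pulls one field out of the JSON text stored in user_id.
rows = conn.execute(
    """
    SELECT json_extract(user_id, '$.id')   AS id,
           json_extract(user_id, '$.name') AS user_name
    FROM deals
    ORDER BY deal_id
    """
).fetchall()

for user_id, user_name in rows:
    print(user_id, user_name)
```

This assumes the SQLite build bundled with Python includes the JSON functions, which is the case for most current distributions.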

Related

Write SQL code in BigQuery to get the data in format attribute1:attribute_value1;attribute2:attribute_value2

I am working on the first party data. Each line in the data file for first-party data must contain all the data for a given user and should be delimited using the CARET (^) character.
The file needs to be UTF-8 encoding (or us-ascii)
Each line in the data file should contain 2 columns:
The first column represents the first-party User ID and must match the client first-party User ID that is used in the user matching process described in section 2
The second column contains all the data associated with the user and should be in the following format:
attribute1:attribute_value1;attribute2:attribute_value2
For example, if there are 3 columns called Age Group, Gender, and Interest in the client’s registration database that need to be imported into Audience Studio, then the following represents a valid data file that can be ingested by Audience Studio:
User1234^gender:male;age:18-24;interest:fishing
User2345^gender:female
I have uploaded the CSV file into a table in BigQuery, but I am not able to do the column formatting in SQL. Can someone please help?

The first step is to split each line into user_id and attributes columns, which can be done with the SPLIT function. After that you have a few options; here is perhaps the simplest one:
SELECT
    user_id,
    REGEXP_EXTRACT(attributes, "gender:(.*?)(?:;|$)") AS gender,
    REGEXP_EXTRACT(attributes, "age:(.*?)(?:;|$)") AS age,
    REGEXP_EXTRACT(attributes, "interest:(.*?)(?:;|$)") AS interest,
FROM (
    SELECT
        SPLIT(rawline, "^")[OFFSET(0)] AS user_id,
        SPLIT(rawline, "^")[OFFSET(1)] AS attributes,
    FROM
        `yourdataset.yourtable`
)
You can also take inspiration from that answer by @Mikhail_Berlyant.
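To make the file format concrete, here is a small Python sketch that goes in both directions: it builds a line in the user_id^attribute:value;... format from the question, and it pulls a single attribute back out with the same regular expression as the query above. The helper names build_line and extract are made up for illustration.

```python
import re


def build_line(user_id, attributes):
    """Join non-empty attributes as key:value pairs separated by ';' after 'user_id^'."""
    pairs = [f"{key}:{value}" for key, value in attributes.items() if value]
    return f"{user_id}^{';'.join(pairs)}"


def extract(attributes, key):
    """Apply the same pattern as the REGEXP_EXTRACT calls, for an arbitrary key."""
    match = re.search(re.escape(key) + r":(.*?)(?:;|$)", attributes)
    return match.group(1) if match else None


line = build_line("User1234", {"gender": "male", "age": "18-24", "interest": "fishing"})
user_id, attributes = line.split("^", 1)

print(line)
print(user_id, extract(attributes, "age"))
```

Skipping empty values in build_line is a design choice that matches the second example line in the question, where a user has only a gender attribute.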

Retrieve all users with recent updates

Sql question.
I have a customer table with:
User id, name, email, phone
The customer can update their name, email and phone at anytime on an app.
How can I find out which user id had changes in name, email or phone number on a particular date?

Since your table doesn't store the date that they made the changes, you can't.
If you add a column with a datetime type (or whatever your specific database product provides) - you could call it LastModified or something like that - then the solution becomes trivial.
I'd give you a specific example, but because you didn't tell us what database engine you use, I can't guarantee to get the syntax right.

This is a general limitation of RDBMSes: they store a "photograph" of your data at a point in time, not a "film" of how it got there.
Depending on the RDBMS you use, you can introduce an updated_at field that records when a row last changed. Either set it in the UPDATE statement itself (say, UPDATE customer SET phone = '000', updated_at = now() WHERE user_id = 999) or have it update automatically; see: create column for auto-date in postgresql
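Since the database engine was never specified, here is a minimal sketch of the updated_at approach using Python's built-in sqlite3 module as a stand-in. The sample rows and the helper names update_phone and changed_on are made up, and the timestamp is passed in explicitly rather than taken from now() so the example is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        user_id    INTEGER PRIMARY KEY,
        name       TEXT,
        email      TEXT,
        phone      TEXT,
        updated_at TEXT  -- timestamp of the last change; NULL if never changed
    )
    """
)
conn.executemany(
    "INSERT INTO customer (user_id, name, email, phone) VALUES (?, ?, ?, ?)",
    [(998, "Ann", "ann@example.com", "111"), (999, "Bo", "bo@example.com", "222")],
)


def update_phone(user_id, phone, changed_at):
    """Change the phone number and record when the change happened."""
    conn.execute(
        "UPDATE customer SET phone = ?, updated_at = ? WHERE user_id = ?",
        (phone, changed_at, user_id),
    )


def changed_on(day):
    """Return ids of users whose row was last modified on the given date (YYYY-MM-DD)."""
    rows = conn.execute(
        "SELECT user_id FROM customer WHERE date(updated_at) = ? ORDER BY user_id",
        (day,),
    )
    return [row[0] for row in rows]


update_phone(999, "000", "2024-03-05 14:30:00")
print(changed_on("2024-03-05"))
```

A single updated_at column only remembers the most recent change per row. If every change has to be found later, a separate history table with one row per change would be needed.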

using a lookup table on a form with Oracle Apex Item

I have an application that uses Oracle Apex 4.2. It has a form (form and report on a table) that needs to display descriptions for columns on the table. For instance, there is a column on the table called fund which has a numeric value (1 to 6). There is a separate table that gives a description for each of these 6 values. Under EDIT PAGE ITEM, under SOURCE, I chose SOURCE TYPE -> SQL QUERY
I entered this query below:
SELECT DESCRIPTION FROM
"#OWNER#"."BU19ANT",
"#OWNER#"."FUNDCD"
WHERE ANTFUNDCD = CODE
where BU19ANT is the table that is used for this form,
FUNDCD is the name of the lookup table,
and ANTFUNDCD and CODE are numeric fields on the respective tables. DESCRIPTION is the value that I want to look up and display on the form.
This gives me the correct answer MOST of the time, but not all the time.
The key to the table ( and the field used to link from the report to the form) is the Soc Security Number. If I run this same query against the Oracle table hard coding the SS Number, I always get the correct answer.
This form has 5 look ups that work this way and they all have the same problem.
I assumed that I DON'T need to include the Social Security Number as part of the query because Apex already knows it.
But I tried to add it anyway and cannot figure out how to code it.
I tried
WHERE ANTSOCIALSECURITYNUMBER ( column on table) = P2_SOCIALSECURITYNUMBER ( the item on this page)
but that gave this error
ORA-00904: "P2_SOCIALSECURITYNUMBER ": invalid identifier
Is there some other way to code this? Or to say where SS Number = current record?
Or am I on the wrong track here?

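Try :P2_SOCIALSECURITYNUMBER (bind variable syntax, which is how page items are referenced in SQL and PL/SQL) or &P2_SOCIALSECURITYNUMBER. (substitution string syntax, including the trailing period, which is meant for static text). Without a condition on the Social Security Number, the query joins every row of BU19ANT to its description, so the item can end up showing the description for a different record.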
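The effect of the missing filter can be reproduced outside Apex. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, purely because it also accepts :NAME bind variables. The sample rows, the fund descriptions, and the helper name fund_description are made up; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE BU19ANT (ANTSOCIALSECURITYNUMBER TEXT PRIMARY KEY, ANTFUNDCD INTEGER);
    CREATE TABLE FUNDCD  (CODE INTEGER PRIMARY KEY, DESCRIPTION TEXT);
    INSERT INTO BU19ANT VALUES ('000-00-0001', 1), ('000-00-0002', 2);
    INSERT INTO FUNDCD  VALUES (1, 'General fund'), (2, 'Capital fund');
    """
)

# The original query: one row per record in BU19ANT, so which one is shown is arbitrary.
unfiltered = conn.execute(
    "SELECT DESCRIPTION FROM BU19ANT, FUNDCD WHERE ANTFUNDCD = CODE"
).fetchall()

# Same query restricted to the current record through a bind variable.
LOOKUP_SQL = """
    SELECT DESCRIPTION
    FROM BU19ANT, FUNDCD
    WHERE ANTFUNDCD = CODE
      AND ANTSOCIALSECURITYNUMBER = :P2_SOCIALSECURITYNUMBER
"""


def fund_description(ssn):
    """Look up the fund description for one record, identified by its key."""
    row = conn.execute(LOOKUP_SQL, {"P2_SOCIALSECURITYNUMBER": ssn}).fetchone()
    return row[0] if row else None


print(len(unfiltered), fund_description("000-00-0002"))
```

In Apex itself the value of :P2_SOCIALSECURITYNUMBER comes from the page item, so only the extra AND condition would be added to the item's source query.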

One to Many - Calculated Column

I am trying to teach myself the new Tabular model for SQL 2012 SSAS to handle some analytic reports that were previously handled in (slow) stored procedures.
I've made decent progress on most of it, just figuring out how things work and how to add the calculations I need but I have been banging my head against the following:
I have a table that has file information -- it has:
ID
FileName
CurrentStatus
UploadedBy
And then a table that has statuses that the file went through (a many relationship to the file table):
FileID
StatusID
TimeStamp
What I'm trying to do is add a calculated column to the File table that returns the TimeStamp at which a file was in a particular status. For example, StatusID=100 means uploaded, so I want to add a calculated column called UploadedDate on the File table that holds the associated TimeStamp from the FileStatus table.
It seems like this should be doable with DAX but I just can't seem to wrap my head around it. Any ideas out there?
In advance, many thanks,
Brent

Here's a formula that should work for what you want to do...
=MAXX(
    CALCULATETABLE(
        'FileStatus'
        ,'FileStatus'[StatusID] = 100
    )
    ,'FileStatus'[TimeStamp]
)

I'm assuming each file can only be in each status once (there is only one row per FileID that has StatusID 100). If so, I believe you can just use a LOOKUPVALUE formula. The formula for your UploadedDate calculated column would be something like
=LOOKUPVALUE(FileStatus[TimeStamp], FileStatus[FileID], File[ID], FileStatus[StatusID], 100)
Here's the MSDN description of LOOKUPVALUE. You provide the column containing the value you want returned, followed by pairs of a column to search and the value to search for. You can add multiple criteria to your lookup this way. Here's a blog post that contains a good example.
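For readers coming from stored procedures, the calculated column computes roughly what the correlated subquery below does. This is only a SQL analogy for checking the logic on sample data, not DAX, and it is run through Python's built-in sqlite3 module with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE File (ID INTEGER PRIMARY KEY, FileName TEXT);
    CREATE TABLE FileStatus (FileID INTEGER, StatusID INTEGER, TimeStamp TEXT);
    INSERT INTO File VALUES (1, 'a.csv'), (2, 'b.csv');
    INSERT INTO FileStatus VALUES
        (1, 100, '2024-01-02 09:00:00'),
        (1, 200, '2024-01-03 10:00:00'),
        (2, 200, '2024-01-04 11:00:00');
    """
)

# For each file, take the latest TimeStamp among its status rows with StatusID = 100.
rows = conn.execute(
    """
    SELECT f.ID,
           (SELECT MAX(s.TimeStamp)
            FROM FileStatus AS s
            WHERE s.FileID = f.ID AND s.StatusID = 100) AS UploadedDate
    FROM File AS f
    ORDER BY f.ID
    """
).fetchall()

for file_id, uploaded_date in rows:
    print(file_id, uploaded_date)
```

A file that never reached status 100 gets an empty UploadedDate, which is also what a blank result in the calculated column would indicate.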

How to design a database table structure for storing and retrieving search statistics?

I'm developing a website with a custom search function and I want to collect statistics on what the users search for.
It is not a full text search of the website content, but rather a search for companies with search modes like:
by company name
by area code
by provided services
...
How to design the database for storing statistics about the searches?
What information is most relevant and how should I query for them?

Well, it's dependent on how the different search modes work, but generally I would say that a table with 3 columns would work:
SearchType SearchValue Count
Whenever someone does a search, say they search for "Company Name: Initech", first query to see if there are any rows in the table with SearchType = "Company Name" (or whatever enum/id value you've given this search type) and SearchValue = "Initech". If there is already a row for this, UPDATE the row by incrementing the Count column. If there is not already a row for this search, insert a new one with a Count of 1.
By doing this, you'll have a fair amount of flexibility for querying it later. You can figure out what the most popular searches for each type are:
... WHERE SearchType = 'Some Search Type' ORDER BY Count DESC
You can figure out the most popular search types:
... GROUP BY SearchType ORDER BY SUM(Count) DESC
Etc.
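Here is a minimal sketch of that three-column table, using Python's built-in sqlite3 module. The table name SearchStats, the helper names, and the sample searches are made up. One small variation from the description above: instead of querying first, it attempts the UPDATE and inserts only when no row was changed, which has the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE SearchStats (
        SearchType  TEXT NOT NULL,
        SearchValue TEXT NOT NULL,
        Count       INTEGER NOT NULL,
        PRIMARY KEY (SearchType, SearchValue)
    )
    """
)


def record_search(search_type, search_value):
    """Increment the counter for this search, inserting the row on first use."""
    cur = conn.execute(
        "UPDATE SearchStats SET Count = Count + 1 "
        "WHERE SearchType = ? AND SearchValue = ?",
        (search_type, search_value),
    )
    if cur.rowcount == 0:  # no existing row was updated, so this search is new
        conn.execute(
            "INSERT INTO SearchStats (SearchType, SearchValue, Count) VALUES (?, ?, 1)",
            (search_type, search_value),
        )


def top_searches(search_type):
    """Most popular search values within one search type."""
    return conn.execute(
        "SELECT SearchValue, Count FROM SearchStats "
        "WHERE SearchType = ? ORDER BY Count DESC",
        (search_type,),
    ).fetchall()


def top_search_types():
    """Search types ranked by their total number of searches."""
    return conn.execute(
        "SELECT SearchType, SUM(Count) FROM SearchStats "
        "GROUP BY SearchType ORDER BY SUM(Count) DESC"
    ).fetchall()


for search in [("Company Name", "Initech"), ("Company Name", "Initech"),
               ("Company Name", "Initrode"), ("Area Code", "512")]:
    record_search(*search)

print(top_searches("Company Name"))
print(top_search_types())
```

With several concurrent writers, the update-then-insert pair would need to run inside a transaction, or be replaced by the engine's own upsert statement where one exists.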

This is a pretty general question but here's what I would do:
Option 1
If you want to strictly separate all three search types, then create a table for each. For company name, you could simply store the CompanyID (assuming your website is maintaining a list of companies) and a search count. For area code, store the area code and a search count. If the area code doesn't exist, insert it. Provided services is most dependent on your setup. The most general way would be to store key words and a search count, again inserting if not already there.
Optionally, you could store search date information as well. As an example, you'd have a table with Provided Services Keyword and a unique ID. You'd have another table with an FK to that ID and a SearchDate. That way you could make sense of the data over time while minimizing storage.
Option 2
Treat all searches the same. One table with a Keyword column and a count column, incorporating SearchDate if needed.

You may want to check this:
http://www.microsoft.com/sqlserver/2005/en/us/express-starter-schemas.aspx
