How to create the following data structure in an SQL environment

I have a Firestore database and I need to convert it to SQL. My SeminarsAndTraining document looks like this:
{
"st_name": "SOL Level 1",
"attendance": [
{"date": "01/29/2021", "present": ["9103", "1282"], "absent": ["8127"]},
{"date": "01/29/2021", "present": ["1203", "1224"], "absent": ["0927"]}
]
}
I have multiple of these SeminarsAndTraining documents inside a collection. Each object inside the attendance array contains the date of the attendance, and the students' IDs are stored in the present and absent arrays.
Problem 1
I know you can't have arrays in SQL, so what is the best approach to this?
"attendance": [
{"date": "01/29/2021", "present": ["9103", "1282"], "absent": ["8127"]},
{"date": "01/29/2021", "present": ["1203", "1224"], "absent": ["0927"]}
]

In a relational database you'll typically have these tables:
SeminarsAndTraining, which stores st_name.
SeminarsAndTraining_attendance, which stores the date of each attendance, and the ID of the SeminarsAndTraining it belongs to.
SeminarsAndTraining_attendance_present, which stores each ID from the present field, and the ID of the SeminarsAndTraining_attendance it belongs to.
SeminarsAndTraining_attendance_absent, which stores each ID from the absent field, and the ID of the SeminarsAndTraining_attendance it belongs to.
You could probably merge the last two tables, and include a present_or_absent value for each.
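A minimal sketch of that schema, using the merged present/absent table suggested above; the column names and PostgreSQL-style SERIAL keys are only illustrative assumptions, not taken from the original data:
CREATE TABLE SeminarsAndTraining (
    id      SERIAL PRIMARY KEY,
    st_name TEXT NOT NULL
);

CREATE TABLE SeminarsAndTraining_attendance (
    id                       SERIAL PRIMARY KEY,
    seminars_and_training_id INT  NOT NULL REFERENCES SeminarsAndTraining(id),
    attendance_date          DATE NOT NULL
);

-- one row per student per attendance date, with a flag instead of separate present/absent tables
CREATE TABLE SeminarsAndTraining_attendance_student (
    attendance_id INT     NOT NULL REFERENCES SeminarsAndTraining_attendance(id),
    student_id    TEXT    NOT NULL,
    present       BOOLEAN NOT NULL,
    PRIMARY KEY (attendance_id, student_id)
);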

Update list of dates in SQL

I have a controller to make a room which needs a JsonBody in order to add the room:
{
"roomName": "Sol",
"properties": "Geluidsdichte kamer",
"capacity": 40,
"buildingName": "16A",
"location": "Leuven",
"reservableDates": ["2022-12-03", "2022-12-04", "2022-12-05"],
"imageUrl":"www"
}
Here we find a reservableDates object, which is just a list of dates when the room is available for reservation.
The backend code that puts this into the database isn't relevant to my problem, so I won't include it here.
However the output I get in my database is this...
select * from rooms inner join rooms_reservable_dates
on room_id = rooms_room_id;
Now I have another function in my backend to update a room (for example, to change its available reservable dates), but the problem is that I don't know how to write the query so I can change the reservable dates while also updating the roomName, for example.
I'm using JpaRepository in Spring Boot, so I have to make a custom query for this.
In PostgreSQL I have 2 tables: Rooms (with all the properties found in the picture except for reservable_dates) and rooms_reservable_dates (which has the roomId and the dates that the room is available).
Thank you very much
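In plain SQL, one possible approach is just a sketch under the two-table layout described above: only room_id and rooms_room_id appear in the join query earlier; the other column names (room_name, capacity, reservable_date) and the example values are guesses. Update the rooms row, then replace that room's rows in rooms_reservable_dates:
-- update the scalar properties of the room
UPDATE rooms
SET room_name = 'Sol', capacity = 40
WHERE room_id = 1;

-- replace the reservable dates: remove the old rows, then insert the new list
DELETE FROM rooms_reservable_dates
WHERE rooms_room_id = 1;

INSERT INTO rooms_reservable_dates (rooms_room_id, reservable_date)
VALUES (1, '2022-12-03'),
       (1, '2022-12-04'),
       (1, '2022-12-05');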

How to group by the amount of values in an array in postgresql

I have a posts table with a few columns, including a liked_by column whose type is an int array.
As I can't post the table here, I'll post a single post's JSON structure, which is as below:
"post": {
"ID": 1,
"CreatedAt": "2022-08-15T11:06:44.386954+05:30",
"UpdatedAt": "2022-08-15T11:06:44.386954+05:30",
"DeletedAt": null,
"title": "Pofst1131",
"postText": "yyhfgwegfewgewwegwegwegweg",
"img": "fegjegwegwg.com",
"userName": "AthfanFasee",
"likedBy": [
3,
1,
4
],
"createdBy": 1,
}
I'm trying to send posts in the order they are liked (most liked posts), which means ordering the posts according to the number of values inside the liked_by array. How can I achieve this in Postgres?
As a side note, I'm using Go with the GORM ORM, but I'm using a raw SQL builder instead of the ORM tools, so I'll be fine with solving this problem in Go as well. The way I achieved this in MongoDB and NodeJS was to add a total like count field based on the size of the likedBy array and sort using that field, as below:
if(sort === 'likesCount') {
data = Post.aggregate([
{
$addFields: {
totalLikesCount: { $size: "$likedBy" }
}
}
])
data = data.sort('-totalLikesCount');
} else {
data = data.sort('-createdAt') ;
}
Use a native query.
Provided that the table column that contains the sample data is called post, then
select <list of expressions> from the_table
order by json_array_length(post->'likedBy') desc;
Unrelated but why don't you try a normalized data design?
Edit
Now that I know your table structure, here is the updated query. Use array_length.
select <list of expressions> from public.posts
order by array_length(liked_by, 1) desc nulls last;
You may also wish to add a where clause.
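For instance, a fuller version of that query might look roughly like this; the selected columns and the filter are only guesses based on the JSON above, assuming GORM's default snake_case column names:
select id, title, post_text, user_name,
       coalesce(array_length(liked_by, 1), 0) as total_likes
from public.posts
where deleted_at is null  -- skip soft-deleted rows, if GORM soft deletes are in use
order by total_likes desc;
Using coalesce(..., 0) treats an empty/NULL array as zero likes, which avoids the need for nulls last.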

N1QL query count for each document of specific type

I am new to Couchbase and to non-relational DBs.
I have a bucket with players and teams (2 types of documents).
Each player has a type, playedFor (an array with all the teams he played for) and a name, for example:
{
  "type": "player",
  "name": "player1",
  "playedFor": [
    "England/Manchester/United",
    "England/Manchester/City"
  ]
}
Each team has a type, name and category, for example:
{
  "type": "team",
  "name": "England/Manchester/City",
  "category": "FC"
}
I want to know how many players played for each team of category FC.
I made this query to calculate it for a specific team:
SELECT COUNT(1) AS total
FROM bucket AS a
WHERE a.type='player'
AND (any r in a.playedFor satisfies r in ["England/Manchester/United"] end)
But how can I make this query for all teams?
The wrinkle in the way you've modeled this data is that a player can play for 1 or more teams (hence the array).
One way to approach this is to use Couchbase's UNNEST clause to "flatten" these arrays (it's basically joining the document to each of the items in the array).
At that point, it becomes as easy as a standard GROUP BY. Here's an example:
SELECT team, count(1) AS totalPlayers
FROM `bucket` AS a
UNNEST a.playedFor team
WHERE a.type='player'
GROUP BY team
This query would generate output like:
[
{
"team": "Pittsburgh/Pirates",
"totalPlayers": 8
},
{
"team": "England/Manchester/United",
"totalPlayers": 10
},
{
"team": "England/Manchester/City",
"totalPlayers": 15
},
{
"team": "Cincinnati/Reds",
"totalPlayers": 21
}
]
(Sorry, I used MLB teams to augment your sample, since I don't know much about soccer teams).
Notice that the separate team documents don't figure into this query, but you could also JOIN to them if you need information from them for your quer(ies).
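Since the original question only counts players for teams of category "FC", one way to bring the team documents in is an ANSI JOIN on the team name. A rough sketch, assuming a Couchbase version with ANSI JOIN support and a suitable index on the team documents:
SELECT team, COUNT(1) AS totalPlayers
FROM `bucket` AS a
UNNEST a.playedFor AS team
JOIN `bucket` AS t ON t.type = 'team' AND t.name = team
WHERE a.type = 'player' AND t.category = 'FC'
GROUP BY team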

Retrieve the count of each record (id) with a condition within CosmosDB

I have a container within CosmosDB that houses items. I need to find out the count of how many records I have within my container, with the conditions of: Source and Date.
This is a sample JSON schema that each of my records/items follows. Each record has a unique id and acts as a single count.
{
  "id": "1111111111122222222233333333",
  "feedback": {
    "Source": "test",
    "Date": "1980-10-15T00:04:34Z",
    "Ser": "test",
    "Count_Of_Comments": "1",
    "Count_Of_Votes": "1"
  }
}
The container within CosmosDB looks something like this:
Goal:
I wish to return the number of id records (or the count) based on the Source and the Date.
This is what I have tried (below), however this does not seem to work and I am wondering if I am missing something here. Any help or suggestions are appreciated.
SELECT VALUE COUNT(c.id), c.Source, c.Date
FROM C
Where Source == "test", AND Date == "1980-10-15T00:04:34Z"
As David comments, there are some syntax errors. Please try this SQL:
SELECT value COUNT(c.id) FROM c Where c.feedback.Source = "test" AND c.feedback.Date = "1980-10-15T00:04:34Z"
If you need Source and Date, you can try this:
SELECT COUNT(c.id) AS Count,max(c.feedback.Source) as Source,max(c.feedback.Date) as Date
FROM c
Where c.feedback.Source = "test" AND c.feedback.Date = "1980-10-15T00:04:34Z"
By the way, both COUNT(c.id) and COUNT(1) can achieve your goal in your situation. For more detail about SQL queries, you can refer to this documentation.
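If you want the count for every Source/Date combination rather than one fixed pair, a GROUP BY variant (supported by the Cosmos DB SQL API) might look roughly like this:
SELECT c.feedback.Source, c.feedback.Date, COUNT(1) AS Count
FROM c
GROUP BY c.feedback.Source, c.feedback.Date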
Hope this can help you.

How to query and iterate over array of structures in Athena (Presto)?

I have an S3 bucket with 500,000+ JSON records, e.g.
{
"userId": "00000000001",
"profile": {
"created": 1539469486,
"userId": "00000000001",
"primaryApplicant": {
"totalSavings": 65000,
"incomes": [
{ "amount": 5000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY" },
{ "amount": 2000, "incomeType": "OTHER", "frequency": "MONTHLY" }
]
}
}
}
I created a new table in Athena
CREATE EXTERNAL TABLE profiles (
userId string,
profile struct<
created:int,
userId:string,
primaryApplicant:struct<
totalSavings:int,
incomes:array<struct<amount:int,incomeType:string,frequency:string>>
>
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
LOCATION 's3://profile-data'
I am interested in the incomeTypes, e.g. "SALARY", "PENSIONS", "OTHER", etc., and ran this query, changing jsonData.incometype each time:
SELECT jsonData
FROM "sampledb"."profiles"
CROSS JOIN UNNEST(sampledb.profiles.profile.primaryApplicant.incomes) AS la(jsonData)
WHERE jsonData.incometype='SALARY'
This worked fine with CROSS JOIN UNNEST, which flattened the incomes array so that the data example above would span across 2 rows. The only idiosyncratic thing was that CROSS JOIN UNNEST made all the field names lowercase, e.g. a row looked like this:
{amount=1520, incometype=SALARY, frequency=FORTNIGHTLY}
Now I have been asked how many users have two or more "SALARY" entries, e.g.
"incomes": [
{ "amount": 3000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY" },
{ "amount": 4000, "incomeType": "SALARY", "frequency": "MONTHLY" }
],
I'm not sure how to go about this.
How do I query the array of structures to look for duplicate incomeTypes of "SALARY"?
Do I have to iterate over the array?
What should the result look like?
UNNEST is a very powerful feature, and it's possible to solve this problem using it. However, I think using Presto's lambda functions is more straightforward:
SELECT COUNT(*)
FROM sampledb.profiles
WHERE CARDINALITY(FILTER(profile.primaryApplicant.incomes, income -> income.incomeType = 'SALARY')) > 1
This solution uses FILTER on the profile.primaryApplicant.incomes array to get only those with an incomeType of SALARY, and then CARDINALITY to extract the length of that result.
Case sensitivity is never easy with SQL engines. In general I think you should not expect them to respect case, and many don't. Athena in particular explicitly converts column names to lower case.
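For completeness, a rough, untested sketch of the UNNEST-based approach mentioned above (count users whose flattened incomes contain more than one SALARY row):
SELECT COUNT(*)
FROM (
    SELECT p.userId
    FROM sampledb.profiles p
    CROSS JOIN UNNEST(p.profile.primaryApplicant.incomes) AS t(income)
    WHERE income.incomeType = 'SALARY'
    GROUP BY p.userId
    HAVING COUNT(*) > 1
) AS users_with_multiple_salaries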
You can combine filter with cardinality to count the array elements having incomeType = 'SALARY' and keep the rows where this happens more than once.
This can be further improved so that the intermediate array is not materialized, by using reduce (see the examples in the docs; I'm not quoting them here, since they do not directly answer your question).
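A sketch of that reduce-based variant, under the same assumptions as above (untested):
SELECT COUNT(*)
FROM sampledb.profiles
WHERE reduce(
        profile.primaryApplicant.incomes,
        0,
        -- increment the running count for each SALARY entry
        (salary_rows, income) -> IF(income.incomeType = 'SALARY', salary_rows + 1, salary_rows),
        salary_rows -> salary_rows
      ) > 1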