When and why are the attributes `propertyIds` and `categoryIds` in the product-response - shopware6

I am doing a simple search on some products using the API endpoint {{endpoint}}/api/search/product. What I sometimes see is that some products have the attributes propertyIds and categoryIds set to null, while other products have these attributes populated.
However: All of these products actually have categories and properties assigned, and those are visible through other attributes.
My question is: What is the reason for this, and how do these values get populated?
Example for one with null values:
"minPurchase": 1,
"purchaseUnit": null,
"referenceUnit": null,
"shippingFree": false,
"purchasePrices": null,
"markAsTopseller": null,
"weight": null,
"width": null,
"height": null,
"length": null,
"releaseDate": null,
"ratingAverage": null,
"categoryTree": null,
"propertyIds": null,
"optionIds": null,
"streamIds": null,
"tagIds": null,
"categoryIds": null,
"childCount": null,
Example for one with Ids:
"b8a475de8b284e17b0ff4dba3729deff"
],
"propertyIds": [
"2c3257b006e240369ab32334096bca40",
"521eab63f64a47ae9f51801d57b4a0ae",
"6322d1a7de254bec8fe813d4dae43e97",
"8253d82499b44fdfbcea4f0238ba3258",
"a8c03127e8644749814ee6ca0f71cba7",
"b5b467f25ff3402ebbd4264b785153ec",
"d37c8640fd43427795365dae9cb750da"
],
"optionIds": null,
"streamIds": null,
"tagIds": null,
"categoryIds": [
"4c84f6cacaa7417fa18524d78156c9e4",
"b8a475de8b284e17b0ff4dba3729deff"
],
"childCount": 5,
"customFieldSetSelectionActive": null,
"sales": 0,

Both attributes propertyIds and categoryIds contain the IDs of assigned properties (such as color, size, material, etc.) and categories (a structural element used to group products in our navigation).
When one of those fields is null in a product, it simply means that no category or property is assigned to the product.
If you want to read the specific properties or categories of a product, the IDs alone are of course not enough - but sometimes you just need the ID to build a reference or a link.
If you want to see the actual properties and categories, you specify them in the request body as associations:
// POST /api/search/product
{
    "associations": {
        "properties": {},
        "categories": {}
    }
}
Another useful use case for the *Ids fields is multi-assignment:
// POST /api/_action/sync
[{
    "action": "upsert",
    "entity": "product",
    "payload": [{
        "id": "0b3db9fe80af4d2bb81ecd649983a648",
        "propertyIds": [
            "13bc59c320a2400ea8d841da15f7b0f8", // Size: XL
            "2fbb5fe2e29a4d70aa5854ce7ce3e20b", // Color: red
            "0060b9b2b3804244bf8ba98cdad50234"  // Material: cotton
        ]
    }]
}]
Also see Bulk Imports

In Shopware there's the concept of indexing data that is frequently read but rarely written. This may, for example, be the count of variants for a product or, what you were wondering about, the IDs of associated entities. The indexed data is then updated whenever you update the entity, e.g. the product.
The idea is that unless you update the entity, that indexed data never changes, so it is safe to store it right with the entity in its table.
Let's say you want to evaluate whether a product has more than a certain number of variants, is part of a category, or has certain properties. Then, instead of executing costly database queries to count rows or joining a bunch of associated tables, you can use these pre-indexed fields for your evaluation.
If you think some data that should have been indexed is missing, you can use the CLI command bin/console dal:refresh:index.
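To make the indexing idea concrete, here is a minimal sketch in Python with SQLite. This is not Shopware's actual indexer; the tables and the child_count column are illustrative. It shows the trade-off described above: the write path updates a denormalized column, so the read path never has to count rows.

```python
import sqlite3

# Hypothetical sketch of the indexing idea (not Shopware's implementation):
# keep a denormalized child_count on the product row, refreshed on write,
# so reads never need a COUNT(*) over the variants table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT PRIMARY KEY, child_count INTEGER DEFAULT 0);
CREATE TABLE product_variant (id TEXT PRIMARY KEY, parent_id TEXT);
""")

def add_variant(conn, variant_id, parent_id):
    # Write path: insert the variant AND refresh the indexed field.
    conn.execute("INSERT INTO product_variant VALUES (?, ?)", (variant_id, parent_id))
    conn.execute(
        "UPDATE product SET child_count ="
        " (SELECT COUNT(*) FROM product_variant WHERE parent_id = ?)"
        " WHERE id = ?",
        (parent_id, parent_id),
    )

conn.execute("INSERT INTO product (id) VALUES ('p1')")
add_variant(conn, "v1", "p1")
add_variant(conn, "v2", "p1")

# Read path: no join or COUNT needed, just read the pre-indexed column.
count = conn.execute("SELECT child_count FROM product WHERE id = 'p1'").fetchone()[0]
print(count)  # → 2
```

The same pattern applies to propertyIds and categoryIds: they are recomputed when the product is written, which is why a stale or never-run indexer leaves them null until dal:refresh:index runs.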

Related

How to group by the amount of values in an array in postgresql

I have a posts table with a few columns, including a liked_by column whose type is an int array.
Since I can't post the table here, I'll post a single post's JSON structure, which looks like this:
"post": {
    "ID": 1,
    "CreatedAt": "2022-08-15T11:06:44.386954+05:30",
    "UpdatedAt": "2022-08-15T11:06:44.386954+05:30",
    "DeletedAt": null,
    "title": "Pofst1131",
    "postText": "yyhfgwegfewgewwegwegwegweg",
    "img": "fegjegwegwg.com",
    "userName": "AthfanFasee",
    "likedBy": [
        3,
        1,
        4
    ],
    "createdBy": 1
}
I'm trying to send posts in the order they are liked (most liked posts first), which means ordering the posts by the number of values inside the liked_by array. How can I achieve this in Postgres?
As a side note, I'm using Go with the GORM ORM, but I'm using a raw SQL builder instead of the ORM tools, so I'd also be fine with a solution in Go. The way I achieved this in MongoDB with NodeJS was to add a total like count field based on the size of the likedBy array and sort by that field, as below:
if (sort === 'likesCount') {
    data = Post.aggregate([
        {
            $addFields: {
                totalLikesCount: { $size: "$likedBy" }
            }
        }
    ])
    data = data.sort('-totalLikesCount');
} else {
    data = data.sort('-createdAt');
}
Use a native query.
Provided that the table column that contains the sample data is called post, then
select <list of expressions> from the_table
order by json_array_length(post->'likedBy') desc;
Unrelated, but why don't you try a normalized data design?
Edit
Now that I know your table structure, here is the updated query. Use array_length:
select <list of expressions> from public.posts
order by array_length(liked_by, 1) desc nulls last;
You may also wish to add a where clause.
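The ordering logic can be sketched locally with Python and SQLite, whose JSON1 function json_array_length plays the same role as array_length(liked_by, 1) does in Postgres (the table and rows here are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, post TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        (1, '{"title": "a", "likedBy": [3, 1, 4]}'),
        (2, '{"title": "b", "likedBy": [7]}'),
        (3, '{"title": "c", "likedBy": [2, 5]}'),
    ],
)

# Order posts by the number of likes, most liked first --
# json_array_length here mirrors array_length(liked_by, 1) in Postgres.
rows = conn.execute(
    "SELECT id FROM posts"
    " ORDER BY json_array_length(post, '$.likedBy') DESC"
).fetchall()
print([r[0] for r in rows])  # → [1, 3, 2]
```

The nulls last clause in the Postgres answer matters because array_length returns NULL for an empty or NULL array; without it, those rows could sort first in a descending order.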

How to create the following data structure in an SQL environment

I have a FireStore database and I need to convert it to SQL. My SeminarsAndTraining document looks like this:
{
    "st_name": "SOL Level 1",
    "attendance": [
        {"date": "01/29/2021", "present": ["9103", "1282"], "absent": ["8127"]},
        {"date": "01/29/2021", "present": ["1203", "1224"], "absent": ["0927"]}
    ]
}
I have multiple of these SeminarsAndTraining documents inside a collection. The object inside the attendance array contains the date for the attendance, and the students' IDs are stored in the present and absent arrays.
Problem 1
I know you can't have arrays in SQL, so what is the best approach to this?
"attendance": [
    {"date": "01/29/2021", "present": ["9103", "1282"], "absent": ["8127"]},
    {"date": "01/29/2021", "present": ["1203", "1224"], "absent": ["0927"]}
]
In a relational database you'd typically have these tables:
SeminarsAndTraining, which stores st_name.
SeminarsAndTraining_attendance, which stores the date of each attendance and the ID of the SeminarsAndTraining it belongs to.
SeminarsAndTraining_attendance_present, which stores each ID from the present field and the ID of the SeminarsAndTraining_attendance it belongs to.
SeminarsAndTraining_attendance_absent, which stores each ID from the absent field and the ID of the SeminarsAndTraining_attendance it belongs to.
You could probably merge the last two tables, and include a present_or_absent value for each.
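The merged variant can be sketched with Python and SQLite (all table and column names here are illustrative, and a status column replaces the separate present/absent tables):

```python
import sqlite3

# Sketch of the normalized layout, with the last two tables merged into one
# attendance_record table that carries a status column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE seminar (
    id INTEGER PRIMARY KEY,
    st_name TEXT
);
CREATE TABLE seminar_attendance (
    id INTEGER PRIMARY KEY,
    seminar_id INTEGER REFERENCES seminar(id),
    date TEXT
);
CREATE TABLE attendance_record (
    attendance_id INTEGER REFERENCES seminar_attendance(id),
    student_id TEXT,
    status TEXT CHECK (status IN ('present', 'absent'))
);
""")

conn.execute("INSERT INTO seminar VALUES (1, 'SOL Level 1')")
conn.execute("INSERT INTO seminar_attendance VALUES (1, 1, '01/29/2021')")
conn.executemany(
    "INSERT INTO attendance_record VALUES (?, ?, ?)",
    [(1, '9103', 'present'), (1, '1282', 'present'), (1, '8127', 'absent')],
)

# Recover the original "present" array for one attendance date with a query.
present = [r[0] for r in conn.execute(
    "SELECT student_id FROM attendance_record"
    " WHERE attendance_id = 1 AND status = 'present' ORDER BY student_id"
)]
print(present)  # → ['1282', '9103']
```

Each array from the document becomes rows in a child table, and the original nesting is recovered with joins on the foreign keys.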

MS graph "Multiple Customer" Bookings - Unable to retrieve customer information

How to retrieve multiple customer bookings customer data from MS Graph?
MS Graph is the API used to access MS Bookings. I am attempting to retrieve customer information, which works for bookings with a single customer. However, a booking can have multiple customers.
API Documentation
Here are the API calls to access bookings for a single customer.
1 Get User Token
https://login.microsoftonline.com/{{TenantID}}/oauth2/v2.0/token
2 Get List of Bookings for date period
https://graph.microsoft.com/beta/bookingBusinesses/{{BookingBusinessId}}/appointments
3 Get Booking with ID
https://graph.microsoft.com/beta/bookingBusinesses/{{BookingBusinessId}}/appointments/{{BookingId}}
The response object for a single-customer booking will include the following customer information in the JSON object. However, the multiple-customer one will provide null. Please note, I have blanked out most values and represented them with a leading and trailing "_".
Single customer booking
"id": "_bookid_",
"selfServiceAppointmentId": "_selfserviceappId_",
"additionalInformation": "",
"isLocationOnline": true,
"onlineMeetingUrl": "_teamsbookingid_",
"customerId": "_CustomerId_",
"customerName": "_customerName_",
"customerEmailAddress": "_email_",
"customerPhone": "_phone_",
"customerNotes": "_notes_",
Multiple customer booking
"id": "[bookid]",
"selfServiceAppointmentId": "[selfserviceappId]",
"additionalInformation": "",
"isLocationOnline": false,
"onlineMeetingUrl": null,
"customerId": null,
"customerName": null,
"customerEmailAddress": null,
"customerPhone": null,
"customerNotes": null,
Instead of listing the customer details, the calendar view delta object will add the following note:
"NOTE: This is a Multi customer booking. Log into Bookings to see customer information and notes for this event."
Has anyone been able to retrieve customer data for multi-customer bookings with MS Graph?

Flattening a nested and repeated structure in BigQuery (standard SQL)

There are a lot of posts on unnesting repeated fields in BigQuery -- but, being new to this environment, I have tried almost every code variation I found to flatten a data file, and I cannot seem to produce one without creating blanks in the id field. It seems like I need to unflatten a nested variable?
I'm using a COVID Dimensions data set that is part of the public collection. Here is some minimal code that produces my problem:
SELECT
id,
authors
FROM
`covid-19-dimensions-ai.data.publications`
CROSS JOIN
UNNEST(authors)
LIMIT 1000
And, here is the JSON structure after running this query. Everything is flattened with the structure I want, but I don't know how to fill in / avoid the blank id variables.
{
    "id": "pub.1130234899",
    "authors": {
        "first_name": "Eric M",
        "last_name": "Yoshida",
        "initials": null,
        "researcher_id": "ur.01071531321.03",
        "grid_ids": [
            "grid.17091.3e"
        ],
        "corresponding": false,
        "raw_affiliations": [
            "Division of Gastroenterology, University of British Columbia, Vancouver, British Columbia, Canada"
        ],
        "affiliations_address": [
            {
                "grid_id": "grid.17091.3e",
                "city_id": "6173331",
                "state_code": "CA-BC",
                "country_code": "CA",
                "raw_affiliation": "Division of Gastroenterology, University of British Columbia, Vancouver, British Columbia, Canada"
            }
        ]
    }
}
See the small correction to your original query: give the unnested array element an alias (author here) and select that alias instead of the original repeated authors column.
SELECT
id,
author
FROM
`covid-19-dimensions-ai.data.publications`
CROSS JOIN
UNNEST(authors) author
LIMIT 1000

How to query and iterate over array of structures in Athena (Presto)?

I have a S3 bucket with 500,000+ json records, eg.
{
    "userId": "00000000001",
    "profile": {
        "created": 1539469486,
        "userId": "00000000001",
        "primaryApplicant": {
            "totalSavings": 65000,
            "incomes": [
                { "amount": 5000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY" },
                { "amount": 2000, "incomeType": "OTHER", "frequency": "MONTHLY" }
            ]
        }
    }
}
I created a new table in Athena
CREATE EXTERNAL TABLE profiles (
    userId string,
    profile struct<
        created:int,
        userId:string,
        primaryApplicant:struct<
            totalSavings:int,
            incomes:array<struct<amount:int,incomeType:string,frequency:string>>
        >
    >
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
LOCATION 's3://profile-data'
I am interested in the incomeTypes, e.g. "SALARY", "PENSIONS", "OTHER", etc., and ran this query, changing jsonData.incometype each time:
SELECT jsonData
FROM "sampledb"."profiles"
CROSS JOIN UNNEST(sampledb.profiles.profile.primaryApplicant.incomes) AS la(jsonData)
WHERE jsonData.incometype='SALARY'
This worked fine: CROSS JOIN UNNEST flattened the incomes array so that the data example above spans two rows. The only idiosyncratic thing was that CROSS JOIN UNNEST lowercased all the field names, e.g. a row looked like this:
{amount=1520, incometype=SALARY, frequency=FORTNIGHTLY}
Now I have been asked how many users have two or more "SALARY" entries, eg.
"incomes": [
    { "amount": 3000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY" },
    { "amount": 4000, "incomeType": "SALARY", "frequency": "MONTHLY" }
],
I'm not sure how to go about this.
How do I query the array of structures to look for duplicate incomeTypes of "SALARY"?
Do I have to iterate over the array?
What should the result look like?
UNNEST is a very powerful feature, and it's possible to solve this problem with it. However, I think using Presto's lambda functions is more straightforward:
SELECT COUNT(*)
FROM sampledb.profiles
WHERE CARDINALITY(FILTER(profile.primaryApplicant.incomes, income -> income.incomeType = 'SALARY')) > 1
This solution uses FILTER on the profile.primaryApplicant.incomes array to get only those with an incomeType of SALARY, and then CARDINALITY to extract the length of that result.
Case sensitivity is never easy with SQL engines. In general I think you should not expect them to respect case, and many don't. Athena in particular explicitly converts column names to lower case.
You can combine filter with cardinality to find rows whose array has more than one element with incomeType = 'SALARY'.
This can be further improved so that the intermediate array is not materialized, by using reduce (see the examples in the docs; I'm not quoting them here, since they don't directly answer your question).
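The FILTER/CARDINALITY logic can be mirrored in plain Python over the raw JSON records, which may help verify the expected count on a small sample before running the Athena query (the records below follow the shape from the question):

```python
# Plain-Python mirror of the Presto query: filter(...) plays the role of
# FILTER(...) and len(...) the role of CARDINALITY(...).
records = [
    {"userId": "00000000001", "profile": {"primaryApplicant": {"incomes": [
        {"amount": 3000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY"},
        {"amount": 4000, "incomeType": "SALARY", "frequency": "MONTHLY"},
    ]}}},
    {"userId": "00000000002", "profile": {"primaryApplicant": {"incomes": [
        {"amount": 5000, "incomeType": "SALARY", "frequency": "FORTNIGHTLY"},
        {"amount": 2000, "incomeType": "OTHER", "frequency": "MONTHLY"},
    ]}}},
]

def salary_count(record):
    incomes = record["profile"]["primaryApplicant"]["incomes"]
    return len([i for i in incomes if i["incomeType"] == "SALARY"])

# Count users with two or more SALARY entries, like the WHERE ... > 1 clause.
users_with_two_or_more = sum(1 for r in records if salary_count(r) > 1)
print(users_with_two_or_more)  # → 1
```

Only the first user has two SALARY incomes, so the count is 1, matching what the Presto query would return over the same two records.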