Counting the sum of votes per member over a large data set. I want to show the vote total for every member in an Event (in this question I don't use the Event model).
To get every member's vote total:
First I use map/reduce to compute each member's vote total, storing the result in a collection named 'sum_vote'.
Then I insert 6 new votes for 'member a' and 2 new votes for 'member b'.
I don't want to re-read all the data, because it is too large.
Two things I have tried (in both cases I cache the result object):
Run map/reduce again with the query option set to member_id = member_a.id and out: {merge: 'sum_vote'}, and do the same for member_b.
Every time a vote is inserted, update the 'sum_vote' result directly: find the document whose key _id equals the member_id, then add 1 to its value.
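For example, something like this (a rough sketch, assuming the usual map/reduce output shape where the total lives in a 'value' field; 'newVote' is a hypothetical variable):

db.sum_vote.update(
    { _id: newVote.member_id },   // map/reduce output is keyed by member_id
    { $inc: { value: 1 } }        // bump the cached total by one
);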
How would you do this? Please share any issues you see with these approaches. Thanks :D
Map/reduce has an output type 'reduce', which is exactly what I need.
Just run a second map/reduce with the query option set and out: {reduce: 'sum_vote'}.
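A minimal sketch of that incremental pattern (the 'votes' collection and 'created_at' field are assumptions; the query limits the second run to votes added since the last run, and out: {reduce: 'sum_vote'} re-reduces the new counts into the existing totals instead of rescanning everything):

var map = function () { emit(this.member_id, 1); };
var reduce = function (key, values) { return Array.sum(values); };

// initial run over the whole collection
db.votes.mapReduce(map, reduce, { out: { reduce: 'sum_vote' } });

// later: incremental run over the new votes only
db.votes.mapReduce(map, reduce, {
    query: { created_at: { $gt: lastRunTime } },
    out: { reduce: 'sum_vote' }
});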
Thanks for your attention.
I have a question about a SQL query (SQL Server) where I am very confused about what they want me to display in the first place. This is an exam question, so I do not believe it's a mistake, and unfortunately I cannot ask for clarification.
Question:
"Write a query that has the following columns:
Transaction ID, Stock type.
Group the rows according to the portfolio."
My current query is:
select p.prfid 'Portfolio', t.trdnbr 'Transaction ID', s.instype 'Stock Type'
from transaction t
join stock s on t.insaddr = s.insaddr
join portfolio p on t.prfnbr = p.prfnbr
One portfolio can have multiple transactions, but I am confused because they are not asking for the number of transactions per portfolio (in which case "group by" would make total sense), so I am not sure how I should group them, except by simple sorting (see the sketch below).
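If "group" just means keeping each portfolio's rows together rather than aggregating them, an ORDER BY may be all they want; a sketch of that reading (the same query, only sorted):

select p.prfid 'Portfolio', t.trdnbr 'Transaction ID', s.instype 'Stock Type'
from transaction t
join stock s on t.insaddr = s.insaddr
join portfolio p on t.prfnbr = p.prfnbr
order by p.prfid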
Also I wondered if they expected me to put a portfolio name in the WHERE clause in order to group similar transactions... I don't know, I am very confused.
Please help at least in which way you think they want me to present the result.
My current sample output, for a better view of the data (it does NOT mean the result should look like this):
Portfolio Transaction ID Stock Type
Cash 1001 Deposit
Cash 1002 Currency
Crypto 1003 Stock
Thanks
I am trying to replicate the Google Analytics data in BigQuery but couldn't do it.
Basically I am using Custom Dimension 40 (user subscription status),
but I am getting wrong numbers in BQ.
Can someone help me with this?
I am using this query but couldn't work out the correct one.
SELECT
(SELECT value FROM hits.customDimensions where index=40) AS UserStatus,
COUNT(hits.transaction.transactionId) AS Unique_Purchases
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA, --new rollup
UNNEST(GA.hits) AS hits
WHERE
(SELECT value FROM hits.customDimensions where index=40) IN ("xx001","xxx002")
GROUP BY 1
I am getting this from BigQuery, which is wrong.
I have checked the dates as well, but I don't know why it's wrong.
Your question is rather unclear, but since you want something to be unique and the numbers are mysteriously not what you expect, I would suggest using COUNT(DISTINCT):
COUNT(DISTINCT hits.transaction.transactionId) AS Unique_Purchases
As far as I understand, you imported Google Analytics data into BigQuery and you are trying to group by the custom dimension with index 40 and values ("xx001","xxx002") in order to know how many hit transactions were performed for each of these dimension values.
Replicating your scenario and trying to execute the query you posted, I got the following error.
However, I created a query that could help with your use case. First it selects the transactionId and dimension value for rows where the transactionId is not null and the index equals 40; then the grouping is done by the dimension value, filtered to the requested values.
WITH tx AS (
SELECT
HIT.transaction.transactionId,
CD.value
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA,
UNNEST(GA.hits) AS HIT,
UNNEST(HIT.customDimensions) AS CD
WHERE
HIT.transaction.transactionId IS NOT NULL
AND
CD.index = 40
)
SELECT tx.value AS UserStatus, count(tx.transactionId) AS Unique_Purchases
FROM tx
WHERE tx.value IN ("xx001","xx002")
GROUP BY tx.value
For further details about the format and schema of the data that is imported into BigQuery, I found this document.
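As a side note, the correlated subqueries in the original query likely fail because an ARRAY field has to be wrapped in UNNEST() before it can be queried in a FROM clause; a minimal sketch of that fix, keeping the original table and values and applying the COUNT(DISTINCT) suggestion from the other answer:

SELECT
  (SELECT value FROM UNNEST(hits.customDimensions) WHERE index = 40) AS UserStatus,
  COUNT(DISTINCT hits.transaction.transactionId) AS Unique_Purchases
FROM
  `xxxxxxxxxxxxx.ga_sessions_2020*` AS GA,
  UNNEST(GA.hits) AS hits
WHERE
  (SELECT value FROM UNNEST(hits.customDimensions) WHERE index = 40) IN ("xx001","xxx002")
GROUP BY 1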
I'm having trouble with a part of the following question. Thank you in advance for your help. I have a hard time visualizing this "fake" database table. I was hoping someone could help me run through my logic and see if it's correct. If someone could just point me in the right direction that would be great!
About:
Sesame is a way to find online classes and activities for adults near you.
Imagine a database table named activities. It has four columns:
activity_id [int, non null]
activity_provider_id [int, non null]
area_id [int, nullable]
starts_at [timestamp, non null]
Question: Given the following query, which counts would you expect to return the highest and lowest values? Which counts would you expect to be the same? Why?
select
count(activity_id),
count(distinct activity_provider_id),
count(area_id),
count(distinct area_id),
count(*)
from activities
My Solution
Highest values: count(*)
Reasoning: COUNT(*) returns the number of rows returned by the SELECT statement, including rows with NULLs and duplicate values.
Lowest values: count(distinct activity_provider_id)
Reasoning: there are presumably fewer distinct activity providers than there are activities or areas.
Same: Unsure - Could someone just point me in the right direction?
count(*) takes into account all rows in the table, while count(some_col) only counts non-null values of some_col.
Since activity_id is a non-nullable column, one would expect the following expressions to return the same, "highest" count:
count(activity_id)
count(*)
As for which expression returns the lowest count out of the three remaining choices, it is not really possible to tell for sure from the information provided in the question. It actually depends on whether there are more, or fewer, distinct areas than distinct activity providers.
There is even an edge case where all five expressions return the same count: when every area_id is non-null and all activity_provider_id and area_id values are unique in the table.
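A toy illustration (hypothetical data) of how the five counts can diverge:

WITH activities AS (
    SELECT 1 AS activity_id, 10 AS activity_provider_id, 100 AS area_id
    UNION ALL SELECT 2, 10, NULL
    UNION ALL SELECT 3, 20, 100
)
SELECT
    count(activity_id)                   AS ids,        -- 3: non-null in every row
    count(distinct activity_provider_id) AS providers,  -- 2: 10 and 20
    count(area_id)                       AS areas,      -- 2: the NULL row is skipped
    count(distinct area_id)              AS areas_dist, -- 1: only 100
    count(*)                             AS all_rows    -- 3: every row, NULLs included
FROM activities;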
As part of my course at university I have to make a database in Microsoft Access, which is somewhat limiting what I'm trying to do. I have a table that records whether a player in a team was present for a fixture, using the values "P", "R", and "M" (Played, Reserves, Missed). I want to make a query that counts 1 for each value of P or R, and a separate count for each M, so that when I make a report that prints a membership card, it shows the number of fixtures they've played in and the number of fixtures they have missed.
Sorry if this isn't clear, I'll try to explain myself further if you ask but I'm not very good with this stuff. Thank you.
Edit: I'll use screenshot links if that's okay. Here is the Fixture Attendance entity that shows whether a member of a team attended a game or not; I'm basing the membership card on it. I want to display the number of fixtures played by the member and the number of fixtures missed, based on the values in the above entity, and use that information in a form I'm going to create. That will be a subform inside my Membership Card form.
I'm presumably really bad at explaining this - I understand Access is rarely used in the real world so I'm not sure why I'm doing this in the first place and don't feel like I'm getting any real knowledge of working with databases.
You should use the COUNT function.
http://office.microsoft.com/en-us/access-help/count-data-by-using-a-query-HA010096311.aspx
I am guessing that you want something like this:
select playerid,
       sum(iif(fixture in ("P", "R"), 1, 0)) as NumPR,
       sum(iif(fixture = "M", 1, 0)) as NumM
from fixtures as t
group by playerid;
The key idea here is putting the conditional part (iif()) inside the sum().
CASE WHEN can be used to translate the codes into 1's and 0's. Then use SUM with a GROUP BY to sum them.
SELECT player_id, SUM(Played) AS Played, SUM(Reserve) AS Reserve, SUM(Missed) AS Missed
FROM
    (SELECT player_id,
            CASE WHEN present = 'P' THEN 1 ELSE 0 END AS Played,
            CASE WHEN present = 'R' THEN 1 ELSE 0 END AS Reserve,
            CASE WHEN present = 'M' THEN 1 ELSE 0 END AS Missed
     FROM fixtures) AS t
GROUP BY player_id;
I'm trying to write an SQL query (SQL Server) that will provide some results based on what other users like.
It is a bit like on Amazon when it says 'Users who bought this also bought...'
It is based on the vote field, where a vote of '1' means a user liked a record; or a vote of '0' means they disliked it.
So when a user is on a particular record, I want to list 3 other records that users who liked the current record also liked.
snippet of relevant table provided below:
ID UserID Record ID Vote DateAdded
16 9999 12013011290 1 2008-11-11 13:23:44.000
17 8888 12013011290 0 2008-11-11 13:23:44.000
18 7777 12013011290 0 2008-11-11 13:23:44.000
20 4930 12013011290 1 2013-11-19 15:04:06.263
I think this requires ordering by a sub-select, but I'm not sure. Can anyone advise me on whether this is possible and, if so, how? Thanks.
p.s.
To maintain the quality of the results I think it would be extra useful to filter by DateAdded. That is:
- 'user x' is seeing recommended records for 'record z'
- 'user y' is someone who has liked 'record z' and 'record a'
- only count 'user y's' like of 'record a' if they liked 'record a' within an hour before or after they liked 'record z'
- in other words, only count the like of 'record a' if it happened during the same website-browsing session as the like of 'record z'
Hope this makes sense!
Something like this?
select r.description
from record r
join (
    select top 3 v.recordid
    from votes v
    where v.vote = 1
      and v.recordid <> 123456789
      and v.userid in (
          select userid from votes
          where recordid = 123456789 and vote = 1
      )
    group by v.recordid                -- avoid listing the same record twice
    order by max(v.dateadded) desc     -- most recently liked first
) as x on x.recordid = r.id
A method I used for the basic version of this problem is indeed using multiple selects: figure out which users liked a specific item, then query further on what else those users liked.
with Likers as
(select user_id from likes where content_id = 10)
select count(user_id) as like_count, content_id
from likes
natural join likers
where content_id <> 10
group by content_id
order by like_count desc;
(Tested using Sqlite3)
What you will receive is a list of items that were liked by the users who also liked item 10, ordered by the number of such likes (within the search domain). I would probably want to limit this as well, since on a larger dataset it's likely to produce a large number of stray items with only one or two similar likes that are in turn buried under items with hundreds of likes.
I suspect the reason you are checking timestamps in the first place is so that if somebody likes laundry detergent, then comes back two days later to like a movie, the system would not associate "people who like Epic Shootout 17 also like Clean More."
I would not recommend using date arithmetic for this. I might suggest creating another table to represent individual "sessions" and using the session_id for this task. Since there are (hopefully!) many, many like records on your database, you want to reduce the amount of work you are making it do. You can also use this session_id for logging any other actions a person did (for analytics purposes.) It is also computationally cheaper to ask for all things that happened within a session with a simple index and identity comparison than to perform date computations on potentially millions of records.
For reference, Piwik defines a new session as thirty minutes since the last action taken.
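A rough sketch of that session-table idea (all names are hypothetical, and the 30-minute timeout from Piwik would be enforced by whatever code assigns session_id):

create table sessions (
    session_id  int identity primary key,
    userid      int not null,
    started_at  datetime not null
);

-- each vote records the session it was cast in
alter table votes add session_id int;

-- "liked in the same session" then becomes a cheap self-join on session_id
select b.recordid, count(*) as same_session_likes
from votes a
join votes b
  on b.session_id = a.session_id
 and b.recordid <> a.recordid
where a.recordid = 123456789
  and a.vote = 1
  and b.vote = 1
group by b.recordid
order by same_session_likes desc;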