Google BigQuery Cymbal Group data sets

This is not so much a programming question as it is a public data set question. Please let me know if there is a more appropriate venue to ask this.
I have been trying to find out more information about this data set:
https://console.cloud.google.com/marketplace/product/cymbal/cymbal
About Cymbal: Google Cloud's demo brand
Cymbal Group
Synthetic datasets across industries showcasing Google Cloud.
I cannot see it when I use the Explorer to browse bigquery-public-data. I can see the cymbal_investments dataset, but not the one described above.
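In case it's useful, this is how I double-checked beyond the Explorer UI, with a minimal Python sketch (it assumes the google-cloud-bigquery package is installed and application default credentials are configured):

# Minimal sketch: list the datasets in the bigquery-public-data project
# and print the ones whose name mentions "cymbal".
# Assumes google-cloud-bigquery is installed and credentials are set up
# (e.g. via gcloud auth application-default login).
from google.cloud import bigquery

client = bigquery.Client()

# list_datasets accepts an explicit project, so we can browse a public
# project without owning it.
for dataset in client.list_datasets(project="bigquery-public-data"):
    if "cymbal" in dataset.dataset_id:
        print(dataset.dataset_id)  # e.g. cymbal_investments

Only cymbal_investments shows up for me.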
I am especially interested in the Retail Subsidiaries data such as:
Cymbal Superstore — An American superstore and grocer with a multinational presence.
Cymbal Shops — An American retail chain selling homewares, electronics, and clothing.
Cymbal Direct — An online direct-to-consumer Chicago-based footwear and apparel retailer.
Please let me know if you can point me to the right dataset.
Thanks for any suggestions.

Related

How to get access to this paper?

So I am doing my own research, and I need to read this paper.
CALVIN, T. W. (1977). "TNT Zero Acceptance Number Sampling." ASQC Technical Conference Transactions, Philadelphia, PA.
https://hero.epa.gov/hero/index.cfm/reference/details/reference_id/8389081
However, I checked Google Scholar, used the academic VPN, and searched my university library;
none of them has access to this paper.
I REALLY need this paper, and I do not live in the United States, so I cannot go to a library there.
Is there any chance you know how to get access to this?
Thank you so so much!

Finding out the people and book capacity for a new library using analysis

A friend was asked this question, and I would like to know what an ideal response to it would be.
If you were asked to do analysis about how many people and book capacity should a new public library have in the area you live in, and what should be the location for such a library, what are the inputs you would need to perform the analysis, what are the key factors that would need to be considered? How would you test whether your analysis was accurate?
The analysis I did was as follows, for the book and people capacity as well as for the location:
1. For the location, questions such as: Where are the city's most populous educational institutes located? Where do most college students go after their college or tuition classes? Which locations are entertainment hotspots that attract people, such as malls and shopping centers? Which location is best connected to other parts of the city via public transport routes like bus, train, and metro?
I think answering these would give an idea of where the library should be established.
Coming to the people and book capacity: counting the students present in the given area, one could estimate the average student population as a reference number (though I am not sure this qualifies as a correct method). One could also take an existing library elsewhere, find out how many people visit it, and derive a rough estimate (this can also be done for the book capacity). Finally, I think we can count how many working professionals, residents, and students inhabit the location and estimate a range for how many people will visit the library.
For the book capacity: a portion of the books would be educational, depending on the educational backgrounds of the majority of students, which could be estimated from the number of students residing there. Besides educational books, the number of fiction and non-fiction books could be based on the average number of best sellers in a given month. And lastly, we can again compare with existing libraries in the vicinity.
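To make that back-of-the-envelope estimate concrete, here is a small sketch of the kind of calculation I mean (every number below is a made-up placeholder, not real data):

# Back-of-the-envelope capacity estimate. Every rate and count below is
# a placeholder assumption; replace them with census or survey data.
segments = {  # estimated people near the candidate site
    "students": 12000,
    "working_professionals": 20000,
    "other_residents": 30000,
}
weekly_visit_rate = {  # assumed fraction of each segment visiting per week
    "students": 0.15,
    "working_professionals": 0.04,
    "other_residents": 0.02,
}

weekly_visitors = sum(n * weekly_visit_rate[s] for s, n in segments.items())
peak_concurrent = weekly_visitors * 0.05  # assume 5% overlap at peak hours
books_per_visitor = 25  # assumed collection-size ratio

print(f"Expected weekly visitors: {weekly_visitors:.0f}")
print(f"Seating capacity to plan for: {peak_concurrent:.0f}")
print(f"Suggested book capacity: {weekly_visitors * books_per_visitor:.0f}")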
For testing our hypothesis, the only way I can think of is conducting surveys or asking people first-hand in a Q&A interview format near college areas. Any other suggestions?

Structured Data Schema Types for Trading Card Game Cards

So, I run a website (swtcg.com) which has a card database for the old Star Wars Trading Card Game by Wizards of the Coast. There are multiple sets/expansions, and each of those has multiple cards.
If you google other trading card games like Magic: The Gathering or Pokémon TCG, you SOMETIMES get rich, carousel-style results for individual cards, and if you click one of the cards, you get the rich knowledge-graph sidebar result. It seems like Google is aware that these are Cards from Sets for a Trading Card Game.
I have tried to search for sites that are using structured data to identify these types, but have only found one or two, and they are just using Product markup.
Does anyone have any advice for what types I should use? I would really like to get to the point where you could search for a card and could get a rich result on the side with details about each card.
I've tried Product, but only some of them are cards that are actually sold. Others are digital-only and free. I've considered Article and CreativeWork, but am really stumped as to what the best options would be for me. Is there such a thing as custom types that aren't insanely difficult to implement?
The contents of your website form a database of playing cards. Let's look at your web page for one card, 100 Battle Droids. In my humble opinion, this content is explicitly a CreativeWork, and this type can probably help you. Because the subject of this web page is a game, the embedded type Game can also help; you can attach it with the about or mainEntity properties as alternatives.
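As a rough illustration of this idea (the property values are placeholders and the type choices are only my suggestion, not a pattern Google documents for trading cards), the markup could be generated like this:

# Sketch of CreativeWork markup with an embedded Game, serialized as
# JSON-LD. Types and property values here are illustrative assumptions.
import json

card_markup = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    "name": "100 Battle Droids",
    "url": "https://swtcg.com/",  # placeholder URL for the card page
    "about": {  # mainEntity would work as an alternative
        "@type": "Game",
        "name": "Star Wars Trading Card Game",
    },
}

print(json.dumps(card_markup, indent=2))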
The card presented in the content of this web page is an image. You can probably be helped by the following Google guideline on structured data for the Article type:
For best results, provide multiple high-resolution images (minimum of 300,000 pixels when multiplying width and height) with the following aspect ratios: 16x9, 4x3, and 1x1. For example:
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ]
}
You can use a free online calculator for the 4x3 and 16x9 aspect ratios. To compress your images, you can search for image compressors on the web; I usually use Compressnow with a maximum level of 65%.
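Instead of an online calculator, a few lines of Python can compute the target heights and check the 300,000-pixel minimum from the guideline above (a small sketch):

# Compute image heights for the recommended aspect ratios, given a width,
# and check the 300,000-pixel minimum from Google's guideline.
def dims_for_ratios(width, ratios=((16, 9), (4, 3), (1, 1))):
    for w, h in ratios:
        height = round(width * h / w)
        pixels = width * height
        status = "meets" if pixels >= 300_000 else "below"
        print(f"{w}x{h}: {width}x{height} ({pixels:,} px, {status} minimum)")

dims_for_ratios(1200)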
Use the Google guides Google Image best practices and UX to responsive images to optimize your images.
The information below the card is a table. Using a responsive table (row only) for this data will probably help. You can use the W3C guide Generating JSON from Tabular Data on the Web to structure this data.
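For example, here is a minimal sketch of turning a card's attribute table into JSON objects, in the spirit of that guide (this simplifies the full csv2json algorithm, and the attribute values are placeholders):

# Turn a small attribute table into JSON objects. A simplification of
# the W3C "Generating JSON from Tabular Data on the Web" approach;
# the attributes below are placeholders.
import csv
import io
import json

table = """Attribute,Value
Set,Attack of the Clones
Rarity,Common
Cost,3
"""

rows = list(csv.DictReader(io.StringIO(table)))
print(json.dumps(rows, indent=2))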
You can use the Google guide to Dataset and the W3C standard Data Catalog Vocabulary (DCAT) - Version 2 to create a database of your cards.

Number of results of a Google (or other) search, programmatically

I am making a little personal project.
Ideally, I would like to be able to programmatically make a Google search and get the count of results. (My goal is to compare the result counts between a lot (100,000+) of different phrases.)
Is there a free way to make a web search and compare the popularity of different texts, using Google, Bing, or whatever (the source is not really important)?
I tried Google, but it seems I can only make 10 free requests per day.
Bing is more permissive (5,000 free requests per month).
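For example, something like this sketch is what I have in mind for Bing (the v7 endpoint and the totalEstimatedMatches field reflect my understanding of the Web Search API and may have changed):

# Sketch: estimated result count for a phrase via the Bing Web Search
# API v7. Endpoint, header, and response field are my understanding of
# the documented API; you need your own subscription key.
import requests

def result_count(phrase, api_key):
    resp = requests.get(
        "https://api.bing.microsoft.com/v7.0/search",
        headers={"Ocp-Apim-Subscription-Key": api_key},
        params={"q": f'"{phrase}"'},  # quoted for exact-phrase matching
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["webPages"]["totalEstimatedMatches"]

# print(result_count("hello world", "YOUR_KEY"))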
Are there other tools or ways to get the result count for a particular phrase for free?
Thanks in advance.
There are several things you're going to need if you're seeking to create a simple search engine.
First of all, you should read and understand where the field of information retrieval started, with G. Salton's paper, or at least read the wiki page on the vector space model. It will require learning at least some undergraduate linear algebra; I suggest Gilbert Strang's MIT video lectures for this.
You can then move to the Brin/Page PageRank paper, which lays out the original concept behind the hyperlink matrix and quickly calculating eigenvectors for ranking, or read the wiki page.
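To make the idea concrete, here is a toy power-iteration sketch of PageRank (a simplification of the paper's formulation, on a made-up three-page link graph):

# Toy PageRank via power iteration. Simplified from the Brin/Page
# formulation: damping factor 0.85, uniform teleport, and no dangling
# nodes in this made-up graph.
links = {  # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
pages = list(links)
d = 0.85
rank = {p: 1 / len(pages) for p in pages}

for _ in range(50):  # iterate until approximately converged
    new = {p: (1 - d) / len(pages) for p in pages}
    for p, outs in links.items():
        for q in outs:
            new[q] += d * rank[p] / len(outs)
    rank = new

print(rank)  # "C" collects the most rank in this graph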
You may also be interested in looking at the code for Apache Lucene
To get into contemporary search-algorithm techniques, you need calculus and regression analysis to learn machine learning and deep learning, as current Google Search has moved away from PageRank and utilizes these. This is partially due to how link farming enabled people to artificially engineer search results, and to the huge amount of metadata that modern browsers and web servers allow to be collected.
EDIT:
For the webcrawler only portion I'd recommend WebSPHINX. I used this in my senior research in college in conjunction with Lucene.

Visualization data gathering for learning

I'm just starting to take an interest in visualization, and I'd like to know where I can get my hands on some data, preferably real-world, to see what queries and graphics I can draw from it. It's more of a personal exercise to create some pretty-looking representations of that data.
After seeing this, I wondered where the data came from and what else could be done with Wikipedia. Is there any way I can obtain data from, say, Wikipedia?
Also, could anyone recommend any good books? I don't trust the user reviews on the Amazon website :-)
You can download the raw Wikipedia data from http://download.wikimedia.org. There are many different views of the data available. The English Wikipedia is by far the largest database, and there isn't a current full dump available, but one is in progress. It will probably take months to finish and be available for download.
The most recent one was 18 GB compressed, which uncompressed to something like 2.5 TB.
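If you grab one of the dumps, a streaming parse keeps memory bounded. Here is a minimal sketch that prints page titles (the file name is a placeholder; wrap it with bz2.open for compressed dumps, and the exact XML schema can vary between dump versions):

# Stream page titles out of a Wikipedia XML dump without loading the
# whole file into memory. The file name is a placeholder.
import xml.etree.ElementTree as ET

for event, elem in ET.iterparse("enwiki-latest-pages-articles.xml"):
    if elem.tag.endswith("}title"):  # tags are namespaced in the dump
        print(elem.text)
    if elem.tag.endswith("}page"):
        elem.clear()  # free each finished page subtree to bound memory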
A fantastic book is The Visual Display of Quantitative Information by Edward Tufte.