I'm currently starting to work with ClickHouse for our in-house analytics system, but there doesn't seem to be any automated way to configure data-retention policies. The only thing I found was ALTER ... MOVE PARTITION (https://clickhouse.tech/docs/en/sql-reference/statements/alter/partition/#alter_move-partition), but it looks like that process has to be manual / implemented in our application layer.
My objective is to move data older than 3 months directly to S3 for archival and cost reasons, while still being able to query it.
Is there any native way to do this directly in ClickHouse with storage policies?
Thanks in advance.
This answer is based on Denny Crane's comment pointing to https://altinity.com/blog/clickhouse-and-s3-compatible-object-storage. I have added comments where the post didn't explain enough, and I'm reproducing the steps here in case the link dies.
Add your S3 disk in a new configuration file (let's say /etc/clickhouse-server/config.d/storage.xml):
<yandex>
    <storage_configuration>
        <disks>
            <!-- This tag is the name of your S3-backed disk, used throughout the rest of this tutorial -->
            <your_s3>
                <type>s3</type>
                <!-- Set this to the endpoint of your S3-compatible provider -->
                <endpoint>https://nyc3.digitaloceanspaces.com</endpoint>
                <!-- Set this to the access key ID provided by your provider -->
                <access_key_id>*****</access_key_id>
                <!-- Set this to the secret access key provided by your provider -->
                <secret_access_key>*****</secret_access_key>
            </your_s3>
        </disks>
        <!-- Don't leave this file yet! We still have things to do here -->
        ...
    </storage_configuration>
</yandex>
Then add a storage policy that spans both disks:
<!-- Put this in place of the three dots in the snippet above -->
<policies>
    <shared>
        <volumes>
            <default>
                <!-- "default" is the disk that ships with the default configuration -->
                <disk>default</disk>
            </default>
            <your_s3>
                <disk>your_s3</disk>
            </your_s3>
        </volumes>
    </shared>
</policies>
Once that is done, you can create your tables with a CREATE statement like the following:
CREATE TABLE visits (...)
ENGINE = MergeTree
ORDER BY time -- MergeTree requires a sorting key; "time" is the column used in the TTL below
TTL toStartOfYear(time) + INTERVAL 3 YEAR TO VOLUME 'your_s3'
SETTINGS storage_policy = 'shared';
Where shared is the name of your policy, and your_s3 is the name of the volume (backed by the your_s3 disk) in that policy.
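To sanity-check that ClickHouse picked up the policy and that parts are landing where you expect, here is a minimal sketch using the clickhouse-driver Python package (my assumption; any client that can run SQL works the same way) to query the system tables:

# Minimal sketch, assuming the clickhouse-driver package and a server on localhost.
from clickhouse_driver import Client

client = Client("localhost")

# system.storage_policies lists every policy and the volumes/disks it contains.
for policy, volume, disks in client.execute(
    "SELECT policy_name, volume_name, disks FROM system.storage_policies"
):
    print(policy, volume, disks)

# system.parts shows which disk each active part of the table currently lives on.
for part, disk in client.execute(
    "SELECT name, disk_name FROM system.parts WHERE table = 'visits' AND active"
):
    print(part, disk)

Once the TTL expression moves a part past the volume boundary, its disk_name should switch from default to your_s3.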
I have an application table with columns like quoteId, accountNumber, and a few others. I have created a REST endpoint to update the account number on the basis of quoteId, i.e. update the account number in the application that has quoteId = {quoteId}. Here is the endpoint:
PUT /applications/quotes/{quoteId}/accountNumber
Is it the correct REST convention?
Is it the correct REST convention?
Maybe.
If your PUT/PATCH/POST request uses the same URI as your GET request, then you are probably on safe ground. If PUT/PATCH/POST use a different URI, then something has gone wrong somewhere.
In other words, if /applications/quotes/{quoteId}/accountNumber is a resource that you link to, then it is the right idea that you send unsafe requests to that URI.
But if accountNumber is information normally retrieved via /applications/quotes/{quoteId}, then /applications/quotes/{quoteId} should be the target resource for edits (instead of creating a new resource used for editing only).
The reason for this is cache-invalidation, as explained in RFC 7234.
If this isn't immediately clear to you, then I suggest reviewing Jim Webber's 2011 talk on REST.
You should use PATCH instead of PUT for partial object updates.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/PATCH
https://www.infoworld.com/article/3206264/how-to-perform-partial-updates-to-rest-web-api-resources.html
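As an illustration, here is a minimal sketch of such a partial update using Python's requests library; the host, the identifier, and the payload shape are all hypothetical:

# Minimal sketch of a PATCH partial update, assuming the requests package.
# The base URL, identifier, and payload shape are hypothetical.
import requests

quote_id = "Q-123" # hypothetical identifier
resp = requests.patch(
    f"https://api.example.com/applications/quotes/{quote_id}",
    json={"accountNumber": "ACC-456"}, # send only the field being changed
    timeout=10,
)
resp.raise_for_status() # raises on 4xx/5xx
print(resp.status_code)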
In my opinion, your URL should be:
PUT /applications/quotes/{quoteId}
payload: {
    accountNumber: <number>,
    ... any other field...
}
Because you only want to update part of an object in the list of quotes, which you identify uniquely with quoteId.
About the use of PUT or PATCH: it's true that PUT means you want to replace the object with the updated copy you are sending (so you must send the entire object to make the update), but the fact is, I think, that many of us use PUT the way you do, for partial updates of an object.
My domain mybasiccrm.com is hosted on hostgator.com
The subdomain tr1.mybasiccrm.com is hosted on tr8.mybasiccrm.com
I have created an MX record on the server tr8 for the domain tr1.mybasiccrm.com, but when I check it with http://mxtoolbox.com/SuperTool.aspx?action=mx%3atr1.mybasiccrm.com&run=toolpage it says "No Records Exist".
How can I get a proper MX record for tr1.mybasiccrm.com?
PS: I can send an email from my Gmail account to the address email@tr1.mybasiccrm.com without a problem.
Thank you all!
Since tr1.mybasiccrm.com is a subdomain of mybasiccrm.com, make sure that you are adding the MX record for "tr1.mybasiccrm.com" in mybasiccrm.com's DNS zone, as long as the subdomain does not have a separate DNS zone of its own.
→ Collect the exact MX value you need to set for tr1.mybasiccrm.com.
→ Open the DNS zone of mybasiccrm.com and add an MX record:
Record type: MX
Record name: tr1
Priority: 0
Value: the MX value for tr1.mybasiccrm.com
Once it is done, check it in any online tool, or with a short script like the one below.
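Here is a minimal sketch of that check using the dnspython package (my assumption; install it with pip install dnspython):

# Minimal sketch of an MX lookup, assuming the dnspython package.
import dns.resolver

try:
    # Ask the resolver for the MX records of the subdomain.
    answers = dns.resolver.resolve("tr1.mybasiccrm.com", "MX")
    for record in answers:
        print(record.preference, record.exchange)
except dns.resolver.NoAnswer:
    print("The name exists but has no MX record yet.")
except dns.resolver.NXDOMAIN:
    print("The name does not exist in DNS at all.")

If the new record still doesn't show up, remember that DNS changes can take a while to propagate, depending on the zone's TTL.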
I am very new to SEO and have been reading about it all day. I have finally created my dream carpooling site and am working on the SEO aspect now. I want to do it on my own so I get to know SEO better. It is a dynamic website; users post their trip details every day, like the examples given below:
site : www.shareurride.com.au
sitemap: www.shareurride.com.au/sitemap.xml
Trip detail page :
http://www.shareurride.com.au/ridedetails.php?id=MjY3&tripdate=MjAxMy0wNy0wNQ,,
http://www.shareurride.com.au/ridedetails.php?id=MTY2&tripdate=MjAxMy0wNy0wNQ,,
(trips like this will be added regularly every day)
I already have a program which dynamically inserts these into my sitemap.
My main question is: do I need to resubmit my site to Google every day, or will it recrawl on its own?
The trip detail page is the only page that is dynamically added to the sitemap. Please let me know; if I need to resubmit the page regularly, are there any tools to do that?
Thanks
Do I need to resubmit my site to Google every day, or will it do so on its own?
No. Once Google knows where to find your sitemap, it will continue to recrawl it periodically.
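That said, you can also notify Google explicitly whenever the sitemap changes. Here is a minimal sketch using only Python's standard library; it hits Google's sitemap ping endpoint, which was documented at the time of this question (it has since been deprecated in favor of Search Console):

# Minimal sketch: ping Google after the sitemap is regenerated.
# Uses only the standard library; the ping endpoint was documented by Google
# when this question was asked but has since been deprecated.
from urllib.parse import quote
from urllib.request import urlopen

sitemap_url = "http://www.shareurride.com.au/sitemap.xml"
ping_url = "https://www.google.com/ping?sitemap=" + quote(sitemap_url, safe="")

with urlopen(ping_url) as resp:
    print(resp.status) # 200 means the ping was accepted

You could run this at the end of the program that regenerates the sitemap.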
Google Apps has a "Google Apps Email Settings API" which allows creating a new mail filter via an API call.
Is there any (perhaps undocumented) way to get the list of current filters and update them?
A Filter object was added to the API that allows for filter management, including retrieval, creation and deletion.
https://developers.google.com/gmail/api/guides/filter_settings
Specifically:
Listing Filters
GET https://www.googleapis.com/gmail/v1/users/userId/settings/filters
Returns a JSON list of Filter objects
Retrieving a specific Filter
GET https://www.googleapis.com/gmail/v1/users/userId/settings/filters/id
Returns a single JSON Filter object
Deleting a specific Filter
DELETE https://www.googleapis.com/gmail/v1/users/userId/settings/filters/id
Creating a Filter
POST https://www.googleapis.com/gmail/v1/users/userId/settings/filters
With a JSON encoded Filter in the request body.
While the REST URLs have v1 in the address, they are linked from the current documentation. Also note that the Gmail API migration is currently in progress and the deprecated API will cease to function as of July 2016; keep this in mind, as the API may change.
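As an illustration, here is a minimal sketch of the listing call using the google-api-python-client package; it assumes you have already obtained OAuth credentials authorized for the gmail.settings.basic scope:

# Minimal sketch, assuming google-api-python-client and existing OAuth
# credentials authorized for the gmail.settings.basic scope.
from googleapiclient.discovery import build

def list_filters(creds):
    service = build("gmail", "v1", credentials=creds)
    # Corresponds to GET .../users/me/settings/filters above.
    response = service.users().settings().filters().list(userId="me").execute()
    for f in response.get("filter", []):
        print(f["id"], f.get("criteria"), f.get("action"))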
No. There is no API to retrieve filters, only create new ones (as you found).
However, users can export all of their filters from the UI and re-import them into another account manually:
Using Filters
I haven't tried it, but according to the Google Admin SDK docs, it looks like you can:
https://developers.google.com/admin-sdk/email-settings/#retrieving_labels
We are implementing an IP-based geolocation service, and we need to find some IPs from various markets (LA, NY, etc.) to fully test the service.
Does anybody know of a directory where we could find what IP ranges are used where?
EDIT: We have already implemented the system; it uses a third-party DB and a web service. We just want some IPs from known markets to verify it's working properly.
I'm going to see if I can get what I need from the free MaxMind database.
Not sure if cost is a factor, but there are a few open-source databases knocking about. This one claims 99.3% accuracy on its free version and 99.8% for its paid version. They've also got a Free & Open Source City Database (76% accuracy at city level).
They're both available as CSV-based databases, so you can easily take a known location and get an IP range for ISPs in the area (see the sketch after this answer).
The tougher part is getting access to a computer in that IP range.
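Here is a minimal sketch of that lookup, assuming the MaxMind GeoLite2 City CSV files; the file and column names below match the published GeoLite2 layout, but verify them against the copy you download:

# Minimal sketch over the GeoLite2 City CSVs; file and column names are
# taken from the published GeoLite2 layout but should be verified.
import csv

def networks_for_city(city_name):
    # Map the city name to its geoname_id(s) via the locations file.
    ids = set()
    with open("GeoLite2-City-Locations-en.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["city_name"] == city_name:
                ids.add(row["geoname_id"])
    # Collect every network block that maps to one of those ids.
    with open("GeoLite2-City-Blocks-IPv4.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["geoname_id"] in ids:
                yield row["network"] # e.g. "203.0.113.0/24"

for net in networks_for_city("Los Angeles"):
    print(net)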
Try looking for sites providing lists of anonymizers; they usually list the country of each anonymizer site. Then either use the IP provided or do a lookup on the anonymizer's name.
Also try searching for lists of anonymous proxies.
We trawled the logs for our huge web site and built up a test collection.
Sorry I can't pass it on. )-:
cheers,
Rob
Maybe this database will be useful for you:
http://www.hostip.info/dl/index.html
It's a collection of IP addresses with countries and cities.
Many open source projects have worldwide mirrors; you can find a country-indexed list of Debian mirrors and kernel.org mirrors. (Note that kernel.org specifically has many mirrors per country; there are eleven United States mirrors, which are located in different regions of the country and would give different information.)
You could try using an automation tool, such as AutoIt, to fire off a series of IP addresses at a whois database service such as ARIN or RIPE, and harvest the responses, probably just varying the first two octets of the IP.
Use Tor with a strict exit node.
You'll need to use these options in your config:
ExitNodes server1, server2, server3
StrictExitNodes 1
You'll also need to identify exit nodes in the region you want that work for you. I suggest using the Search Whois feature at ARIN to check a node's location if the Tor country icon isn't good enough. It can be a bit of a pain to identify working Tor nodes in each region you wish to test, but it's possible and free.