Pandas GBQ and Python 3 - pandas

Is the latest release of the pandas Google BigQuery module (currently 0.16.2) compatible with Python 3? I know at one point it wasn't, but I'm not sure whether that update has been made. If not, are there any plans for the pandas GBQ module to support Python 3?

With the addition of several new gbq features and an overhaul of the way testing is done, this should be pretty easy to add support for (really just a matter of raising the minimum required api-client version).
Here is an issue to track this feature. Pull requests are welcome!
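For reference, querying BigQuery from pandas is done through the pandas-gbq package. Here is a minimal sketch, assuming pandas-gbq is installed and GCP credentials are configured; the query and project ID are placeholders against a real public dataset:

```python
# Sketch: pulling a BigQuery result into a pandas DataFrame with pandas-gbq.
# The query below targets a public dataset; the project id is a placeholder.
QUERY = """
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY total DESC
LIMIT 10
"""

def top_names(project_id):
    # Imported lazily so the sketch can be read without the package installed.
    import pandas_gbq
    return pandas_gbq.read_gbq(QUERY, project_id=project_id)
```

Calling `top_names("my-gcp-project")` would prompt for credentials on first use and return a DataFrame with the ten most common names.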

Related

Is there a version of Pandas that will work with MicroPython?

I have a program that uses Pandas and NumPy which I want to run on an ESP32 microcontroller. Is there a version of Pandas and NumPy that will work on the ESP32 so I can run the existing program?
If yes, does anyone know of any tutorials or documentation I can reference to learn? I know enough about Python and micro controllers to be dangerous. So please dumb it down 🤣.
For NumPy, I think the most advanced library is
ulab
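ulab exposes a numpy-like subset of the API, so simple numpy code often ports with just a changed import. A minimal sketch; the try/except lets the same code run under MicroPython with ulab or under regular CPython with NumPy:

```python
# ulab mirrors a subset of the numpy API, so the same code can run on a
# MicroPython build with ulab (e.g. on an ESP32) and on desktop Python.
try:
    from ulab import numpy as np   # MicroPython firmware built with ulab
except ImportError:
    import numpy as np             # regular CPython fallback

a = np.array([1.0, 2.0, 3.0])
b = a * 2 + 1                      # elementwise arithmetic, as in numpy
mean_b = float(np.mean(b))         # -> 5.0
```

Pandas has no equivalent on MicroPython, so DataFrame-based code would need to be rewritten around arrays like these.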

What Kotlin package gives NumPy capabilities?

I would like to convert Python code that uses numpy to Kotlin. Which package is recommended that delivers most (or all) of numpy's capabilities?
Thanks
See: https://github.com/Kotlin/multik
For more information:
https://blog.jetbrains.com/kotlin/2021/02/multik-multidimensional-arrays-in-kotlin/
Multik Architecture
Initially, we attempted to add Kotlin bindings to existing solutions,
such as NumPy. However, this proved cumbersome and introduced
unnecessary environmental complexity while providing little benefit to
justify the overhead. As a result, we have abandoned that approach and
started Multik from scratch.

Has anyone found a way to call the Bloomberg BQL API using pdblp or another Python package?

BQL works in Excel using what appears to be the same API add-in, using the same fields to call Bloomberg data, e.g. PX_LAST. I currently run models in Python using pdblp, which works great, and I would love to move to the BQL version of the API to optimize data usage outside of the terminal. Is anyone aware of any effort to support BQL in any package? I know someone asked about this last year; I'm looking for an update.
Quite old, but maybe someone comes across this thread...
I talked to BBG recently and they have no intention of opening the BQL API for Python. They couldn't give me a reason when I asked, but I guess they want to promote their internal BQNT environment.
I use:
"from xbbg import blp"
This seems to work pretty well.
The syntax is very similar to what one would use in Excel, e.g.
df = blp.bdh(tickers1, flds=fields,
             start_date='2017-01-01', end_date='2023-01-25')
There is also:
"import blpapi"
But I find that less user-friendly. You have to initiate sessions etc.

Choosing between Scrapy v1.8 and v2.0

Hi, I want to learn Scrapy and use it to scrape dynamic data off of webpages, then use that data in a website backend.
When I went to the official docs, I found that v2.0 had just come out.
Given that I'm new to Scrapy and plan to develop an autonomous hosted application, I was wondering whether I should choose v1.8 over v2.0, since its bugs would have been worked out better and there would be more tutorials, etc. On the other hand, I'll end up learning 2.0 eventually anyway, so maybe I should start with 2.0 itself.
So I have two questions:
Are there any major changes from v1.8 to v2.0? (I am aware that there are release notes accompanying each version, but the only thing I can really understand from them is that Python 2 support was removed; everything else uses terminology that I don't understand.)
I'd be grateful for your advice on which one I should opt for.
I have worked with Selenium & BeautifulSoup4 on one project beforehand, which involved scraping stock prices and relative strength index, and using them as part of a Flask-backed web app.
Always use the latest Scrapy release for a new project, unless you cannot for some reason.
There are no major changes in how Scrapy works between 1.8 and 2.0; upgrading from 1.8 to 2.0 should be as easy as upgrading from 1.7 to 1.8.

Why do we need Hadoop distributions?

I am new to Hadoop, so can anybody please explain to me why we need Cloudera or Hortonworks? We can download each Apache project and use those libraries to create a Big Data project, right? Also, if I already use a Linux OS, do I have to use the cloudera-quickstart VMware image? Thanks in advance.
Let's look at this using an analogy.
Assume you are using OS 'D' at version 'v1', and on it you need three different libraries: A, B and C.
A depends on B, and C also depends on B, but different versions of A and C depend on different versions of B.
Now if you need all three libraries, it becomes your headache to make sure you install versions of each that are compatible with one another, with no clashes.
Plus, not everyone is an expert in all three libraries as well as the underlying system. So what happens if some optimization is needed when using these libraries in your own tools? And what about the issues you run into while using them?
That's where these "stack distributions" come into play. Each of these vendors provides a complete stack that is tested as a whole, with all the packaged libraries compatible with one another, not just with Hadoop. This makes a lot of people's lives easier. Also, depending on what plan or subscription you have with the vendor, you can get support, training and other auxiliary services.
Just to add as an extra, please remember that open source does not necessarily mean free. (Please note that this is my personal opinion.)
As to the other part of your question, whether on Linux you need to use a VMware image or the like: for a beginner, or for learning purposes, it makes your life rather simpler.