how to fix the problem of downloading fasttext-model300? - text-mining

I'm using windows 10 and python 3.3. I tried to download fasttext_model300 to calculate soft cosine similarity between documents, but when I run my python file, it stops after arriving at this statement:
fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')
There are no errors or not responding, It just stops without any reaction.
Does anybody know why it happens?
Thanks

I'd recommend against using the gensim api.load() functionality. It dynamically runs new, unversioned source code from remote servers – which is opaque in its operations & suboptimal for maintaining a secure local configuration, or debugging any issues which occur.
Instead, find the actual exact data files you trust and download them as plain data. Then, use specific library operations, like the KeyedVectors.load_word2vec_format() method, so instantiate exactly the model you need, using precise local-file paths you understand.
Following those steps may make it clearer what, if anything, is going wrong. If it doesn't, try also enabling logging at the INFO level to gather more information about what progress is made before failure (and add any new details as a comment or to your question).

python3 -m gensim.downloader --download fasttext-wiki-news-subwords-300
Try using this. Source : https://awesomeopensource.com/project/RaRe-Technologies/gensim-data

Related

How can a modified Julia package be used natively?

So, there is this cool package I've found but it leaves a lot to be desired. Since it made more sense to modify it, rather than build a new one myself, I changed the code in the corresponding source directory (C:\Users[my username].julia\v0.4[package name]\src). I made sure to modify not just the base.jl file, but also the [name of package].jl one so that there are no issues with dependencies or the new functions I added. I tried running the package several times to ensure that Julia doesn't spit out any errors or exceptions (the original package had some deprecated stuff, which I also remedied). Still, I fail to use the additional functionality of the package that I augmented. Any help would be greatly appreciated.
I'm using Julia ver 0.4.2, on a Windows 7 machine. As an IDE I use Notepad++. Thanks
I'm not exactly sure what you tried, but here's a guess as to what's going on: if you've already loaded the package in your julia session, edits to the source files won't take effect unless you explicitly reload the package. There are some good workflow tips here, and more explanation of the module system here.
However, for a newbie the easiest thing might be to quit julia and restart.
As far as making changes to a package, as Gnumic commented, your best approach is to make a branch and commit your changes there. Once you become convinced your changes represent an improvement, consider sharing your changes with the rest of the world.

libtls: select() and tls_read() working together

I want to add an ssl support to an old chat application I wrote years ago. I did a lot of reading on OpenSSL and LibreSSL and I decided to try a new libtls API. I think developers did a really great job on this one.
I found it to be very easy to use - almost no changes to my existing code where required. But here is one thing I need to figure out now:
Back in a day, I was using select() to monitor sockets and recv() to read a data. This was easy, because both of those functions are working on file descriptors.
Now, with libtls, function tls_read() requires a tls context as a first argument. This means I need to search the list of clients to get an appropriate tls context every time I have a descriptor ready to be read. This is not that hard but maybe someone knows a better solution? I will appreciate all comments and code samples.
Unless I'm misreading the documentation, it seems to me that if you create the sockets yourself, and then use tls_connect_fds/tls_connect_socket/tls_accept_fds/tls_accept_socket afterwards, you'll have normal file handles available you can trivially use with select()/poll()/etc. You'd still need to keep around some sort of file descriptor to context mapping to actually issue the tls_read/tls_write once you were ready, but that's just your choice of linked list or hashtable, depending on what language you're using and what stdlib you have available.

Can't eliminate Access corruption

My firm's Access database has been having some serious problems recently. The errors we're getting seem like they indicate corruption -- here are the most common:
Error accessing file. Network connection may have been lost.
There was an error compiling this function.
No error, Access just crashes completely.
I've noticed that these errors only happen with a compiled database. If I decompile it, it works fine. If I take an uncompiled database and compile it, it works fine -- until the next time I try to open it. It appears that compiling the database into a .ACCDE file solves the problem, which is what I've been doing, but one person has reported that the issue returned for her, which has me very nervous.
I've tried exporting all of the objects in the database to text, starting with a brand new database, and importing them all again, but that doesn't solve the problem. Once I import all of the objects into the clean database, the problem comes back.
One last point that seems be related, but I don't understand how. The problem started right around the time that I added some class modules to the database. These class modules use the VBA Implements keyword, in an effort to clean up my code by introducing some polymorphism. I don't know why this would cause the problem, but the timing seems to indicate a relationship.
I've been searching for an explanation, but haven't found one yet. Does anyone have any suggestions?
EDIT: The database includes a few references in addition to the standard ones:
Microsoft ActiveX Data Objects 2.8
Microsoft Office 12.0
Microsoft Scripting Runtime
Microsoft VBScript Regular Expressions 5.5
Some of the things I do and use when debugging Access:
Test my app in a number of VM. You can use HyperV on Win8, VMWare or VirtualBox to set up various controlled test environments, like testing on WinXP, Win7, Win8, 32bit or 64 bits, just anything that matches the range of OS and bitness of your users.
I use vbWatchDog, a clever utility that only adds a few classes to your application (no external dependency) and allows you to trap errors at high level, and show you exactly where they happen. This is invaluable to catch and record strange errors, especially in the field.
If the issue appears isolated to one or a few users only, I would try to find out what is special about their config. If nothing seems out of place, I would completely unsintall all Office component and re-install it after scrubbing the registry for dangling keys and removing all traces of folders from the old install.
If your users do not need a complete version of Access, just use the free Access Runtime on their machine.
Make sure that you are using consistent versions of Access throughout: if you are using Access 2007, make sure your dev machine is also using that version and that all other users are also only using that version and that no components from Access 2010/2013 are present.
Try to ascertain if the crash is always happening around the same user-actions. I use a simple log file that I write to when a debugging flag is set. The log file is a simple text file that I open/write to/close everytime I log something (I don't keep it open to make sure the data is flushed to the file, otherwise when Access crashes, you may only have old data in the log file as the new one may still be in the buffer). Things I log are, for instance, sensitive function entry/exit, SQL queries that I execute from code, form open/close, etc.
As a generality, make sure your app compiles without issue (I mean when doing Debug > Compile from the IDE). Any issue at this stage must be solved.
Make absolutely sure you close all open recordsets, preferrably followed by setting their variables to Nothing. VBA is not as sensitive as it used to be about dangling references, but I found it good practice, especially when you cannot be absolutely sure that these references will be freed (especially when doing stuff at Module-level or Class-level for instance, where the scope may be longer-lived than expected).
Similarly, make sure you properly destroy any COM object you create in your classes (and subs/functions. The Class_Terminate destructor must explicitly clean up all. This is also valid when closing forms if you created COM objects (you mentioned using ADOX, scripting objects and regex). In general keeping track of created objects is paramount: make sure you explicitly free all your objects by resetting them (for instance using RemoveAll on a dictionary, then assigning their reference to Nothing.
Do not over-use On Error Resume or On Error Goto. I almost never use these except when absolutely necessary to recover from otherwise undetectable errors. Using these error trapping constructs can hide a lot of errors that would otherwise show you that something is wrong with your code. I prefer to program defensively than having to handle exceptions.
For testing, disable your error trapping to see if it isn't hiding the cause of your crashes.
Make sure that the front-end is local to the user machine, You mention they get their individual front-end from the network but I'm not sure if they run it from there or if it it copied on their local machine. At any rate, it should be local not on a remote folder.
You mention using SQL Server as a backend. Try to trace all the queries being executed. It's possible that the issue comes from communication with SQL Server, a corrupt driver, a security issue that prevents some queries from being run, a query returning unexpected data, etc. Watch the log files and event log on the server closely for strange errors, especially if they involve security.
Speaking of event log, look for the trace of the crash in the event log of your users. There may be information there, however cryptic.
If you use custom ribbon actions, make sure thy are not causing issues. I had strange problems over time with the ribbon. Log all all function calls made by the ribbon.

Error checking overkill?

What error checking do you do? What error checking is actually necessary? Do we really need to check if a file has saved successfully? Shouldn't it always work if it's tested and works ok from day one?
I find myself error checking for every little thing, and most of the time if feels overkill. Things like checking to see if a file has been written to a file system successfully, checking to see if a database statement failed.......shouldn't these be things that either work or don't?
How much error checking do you do? Are there elements of error checking that you leave out because you trust that it'll just work?
I'm sure I remember reading somewhere something along the lines of "don't test for things that'll never really happen".....can't remember the source though.
So should everything that could possibly fail be checked for failure? Or should we just trust those simpler operations? For example, if we can open a file, should we check to see if reading each line failed or not? Perhaps it depends on the context within the application or the application itself.
It'd be interesting to hear what others do.
UPDATE: As a quick example. I save an object that represents an image in a gallery. I then save the image to disc. If the saving of the file fails I'll have to image to display even though the object thinks there is an image. I could check for failure of the the image saving to disc and then delete the object, or alternatively wrap the image save in a transaction (unit of work) - but that can get expensive when using a db engine that uses table locking.
Thanks,
James.
if you run out of free space and try to write file and don't check errors your appliation will fall silently or with stupid messages. i hate when i see this in other apps.
I'm not addressing the entire question, just this part:
So should everything that could
possibly fail be checked for failure?
Or should we just trust those simpler
operations?
It seems to me that error checking is most important when the NEXT step matters. If failure to open a file will allow error messages to get permanently lost, then that is a problem. If the application will simply die and give the user an error, then I would consider that a different kind of problem. But silently dying, or silently hanging, is a problem that you should really do your best to code against. So whether something is a "simple operation" or not is irrelevant to me; it depends on what happens next, or what would be the result if it failed.
I generally follow these rules.
Excessively validate user input.
Validate public APIs.
Use Asserts that get compiled out of production code for everything else.
Regarding your example...
I save an object that represents an image in a gallery. I then save the image to disc. If the saving of the file fails I'll have [no] image to display even though the object thinks there is an image. I could check for failure of the the image saving to disc and then delete the object, or alternatively wrap the image save in a transaction (unit of work) - but that can get expensive when using a db engine that uses table locking.
In this case, I would recommend saving the image to disk first before saving the object. That way, if the image can't be saved, you don't have to try to roll back the gallery. In general, dependencies should get written to disk (or put in a database) first.
As for error checking... check for errors that make sense. If fopen() gives you a file ID and you don't get an error, then you don't generally need to check for fclose() on that file ID returning "invalid file ID". If, however, file opening and closing are disjoint tasks, it might be a good idea to check for that error.
This may not be the answer you are looking for, but there is only ever a 'right' answer when looked at in the full context of what you're trying to do.
If you're writing a prototype for internal use and if you get the odd error, it doens't matter, then you're wasting time and company money by adding in the extra checking.
On the other hand, if you're writing production software for air traffic control, then the extra time to handle every conceivable error may be well spent.
I see it as a trade off - extra time spent writing the error code versus the benefits of having handled that error if and when it occurs. Religiously handling every error is not necessary optimal IMO.

How to locally test cross-domain builds?

Using the dojo toolkit, what is the proper way of locally testing code that will be executed as cross-domain, without making the actual build?
As it appears, there are three possible options (each, with their own drawbacks):
Using local (non xd) XMLHttpRequest dojo.require
This option does not really test the xd behavior, since it dojo.require[s] the js synchronously via XHR.
djConfig.debugAtAllCosts = true;
Although this option does load the required code asynchronously (via the 'script' tag), it also pulls the code in via XHR, parses the dojo.require[s] inside that, and pulls them in. This (using the loader_debug), again, is not what the loader_xd is doing. More info on this topic in a different question.
Creating a cross-domain build
This approach requires a build, which is not possible in the environment which I'm running the code in (We're using our own on-the-fly build process, which includes only the js that is necessary for a particular page. This process is not suitable for development).
Thus, my question: is there a way to use the loader_xd, which does not require an xd build (which adds the xd prefix / suffix to every file)?
The 2nd way (using the debugAtAllCosts) also makes me question the motivation for pre-parsing the dojo.require[s]. If the loader_xd will not (or rather can not) pre-parse, why is the method that was created for testing/debugging doing so?
peller has described the situation. If you wanted to just generate .xd.js file for your modules, you could look at util/buildscripts/jslib/buildUtilXd.js and its buildUtilXd.xdgen() function.
It would take a bit of work to make your own script, but you could look at util/buildscripts/build.js for pointers.
I am hoping in the future for Dojo (maybe Dojo 2.x timeframe) we can switch to a loader that just uses script tags with a module format that has a function wrapper around the module, something that is coded by the developer. This would allow the same module format to work in the local and xd cases.
I don't think there's any way to do XD loading without building and deploying it. Your analysis of the various options seems about right.
debugAtAllCosts is there specifically to solve a debugging problem, where most browsers, until recently, could not do anything intelligent with code brought in through eval. Still today, Firefox will report exception in the console as appearing at the eval site (bootstrap.js) with a line number offset from the eval, rather than from the actual eval buffer, and normally that eval buffer is anonymous. Firebug was the first debugger to jump through some hoops to enhance the debugging experience and permitted special metadata that Dojo's loader injects between the XHR and the eval to determine a filepath to the source. Webkit/Safari have recently implemented this also. I believe debugAtAllCosts pre-dates the XD loader.