I am reading Object-Oriented Analysis and Design with Applications. Chapter 4 covers classification, but I don't understand what exactly classification is.
Is it about finding objects? Is it about finding modules?
It is the categorization of entities, objects, or anything else you can imagine, based on some "common characteristic".
for example,
banana, apple, book, sun, computer, phone
before categorizing,
object -> banana, apple, book, sun, computer, phone
they can be categorized as:
edible (common characteristic) -> banana, apple
electronics (common characteristic) -> computer, phone
The act of categorizing things in this way is itself called classification.
Classification is identifying objects and then grouping them based on their similarities or on how related they are.
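As a rough illustration of the grouping described above, here is a minimal Python sketch. The `things` table and the `classify` helper are made up for this example; they only mirror the banana/apple/computer/phone list and are not taken from the book.

```python
from collections import defaultdict

# Hypothetical data: each thing is tagged with the characteristics it has.
things = {
    "banana": {"edible"},
    "apple": {"edible"},
    "book": set(),
    "sun": set(),
    "computer": {"electronic"},
    "phone": {"electronic"},
}


def classify(things):
    """Group thing names under every characteristic they share."""
    groups = defaultdict(list)
    for name, characteristics in things.items():
        for characteristic in characteristics:
            groups[characteristic].append(name)
    return dict(groups)


groups = classify(things)
print(groups["edible"])
print(groups["electronic"])
```

Things with no listed characteristic (book, sun) simply stay ungrouped until you pick a characteristic that applies to them.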
I'm writing a web app in Django 2: a brand new car calculator. I don't know which data model to choose: a relational or an object-oriented database.
It is the well-known flow where you first choose the model, then the body type (sedan, combi, etc.), then the color.
The problem occurs when you choose the engine, transmission (automatic, manual), equipment, and extra packages. As you know, there are many options, and not all combinations are possible.
For example, you can't choose a 1.0 petrol engine with automatic transmission and four-wheel drive.
My first choice was object-oriented programming with inheritance, but I can't find out how to limit the options in subclasses.
Has any of you run a similar project?
Best regards
I'm pretty new to programming with object-oriented programming languages.
So, how would you explain the concept of object-oriented programming to a kid?
Two of the key concepts you need to understand in Object Oriented Programming (OOP) are objects and classes. This is a very basic explanation, but I hope it helps you understand other documentation.
Let's compare OOP with chocolate molds. To make chocolates, you first need to build a mold. The mold fixes some characteristics of the future chocolates, such as shape and size; how you build the mold determines how the chocolates will turn out.
Once the mold is ready, you can create the chocolates. All chocolates take on the mold's characteristics and have the same shape and size, but there are some characteristics of the resulting chocolates that you can vary, e.g. the type of chocolate (black or white). You can also fill the chocolates with different things like nuts, almonds, peanuts, etc.
So, in this analogy, molds are classes, and they determine the resulting chocolates. Chocolates are objects created from a class. Objects are also called instances of a class.
Classes have attributes or variables. In this analogy, the attributes would be: chocolate_type (black/white), chocolate_filler (nuts, almonds, peanuts, nothing, etc.), elaboration_date, due_date.
When a new object is created, you have to define each of its attributes, like:
chocolate1: black, filled with nuts, elaborated: 01/01/2016, dd: 03/01/2016
chocolate2: white, filled with almonds, elaborated: 01/01/2016, dd: 03/01/2016
chocolate3: black&white, filled with nuts, elaborated: 01/01/2016, dd: 03/01/2016
The attributes of a class are defined using variables of types such as string, boolean, integer, etc.
Each object can also have methods/functions that define its behavior (what actions the object can perform).
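To make the analogy concrete, here is a minimal Python sketch of the mold as a class. The attribute names come from the explanation above; the `is_expired` method is a hypothetical example of behavior, and the dates assume the day/month/year reading of "03/01/2016".

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chocolate:
    """The mold: it fixes which attributes every chocolate has."""
    chocolate_type: str      # "black", "white", ...
    chocolate_filler: str    # "nuts", "almonds", "nothing", ...
    elaboration_date: date
    due_date: date

    def is_expired(self, today: date) -> bool:
        # Hypothetical method: behavior shared by every Chocolate object.
        return today > self.due_date


# Each object (instance) gets its own attribute values.
# Assumption: dates are day/month/year, so the due date is 3 January 2016.
chocolate1 = Chocolate("black", "nuts", date(2016, 1, 1), date(2016, 1, 3))
chocolate2 = Chocolate("white", "almonds", date(2016, 1, 1), date(2016, 1, 3))

print(chocolate1.chocolate_type, chocolate1.is_expired(date(2016, 1, 4)))
```

Both chocolates come from the same class, so they share the same attributes and methods, but each one holds its own values.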
https://en.wikipedia.org/wiki/Object-oriented_programming
I hope this very basic explanation helped you.
Object Oriented Programming (OOP) is the art of code to some, and a really hostile programming environment to others. OOP is basically when you use constructors/classes to define objects. OOP is beneficial in my profession because of its well-developed concepts, such as inheritance and encapsulation. Although OOP does have a few flaws, it's really useful when you want to use one of the established design patterns, but it can be annoying when dealing with simple functionality. I would recommend it when you would like to create multiple objects with the same methods and values. You should google it for further information and for ways to learn it. I recommend Marijn Haverbeke's book Eloquent JavaScript. The whole book is available for free online. It helps you master JavaScript and talks a lot about OOP, starting from the 6th chapter, called "The Secret Life of Objects". I hope this helped you learn more about OOP :)
OOP is a way to develop an application in a more modular fashion, rather than writing flat code, and it prevents repetition of code.
For example, if I have a bug in the code, I can go and fix the bug in one place, knowing I won't have to edit the same code in different places.
It lets me easily organize my code into separate parts that handle different concerns in my application.
For example, communicating with the database: I'll have specific (class) files that handle my DB.
Customizing my own objects using classes can help define methods, properties, and events that will apply to all objects created later.
For example, if I have a website that signs up users, I can automatically create a new user object that has all the functionality that all the other users have.
OOP can help create different settings for different objects without repeating code, by creating a kind of template.
So, continuing the previous users example, I can define a general user class that has the common settings, apply it to all users, and extend it with other classes such as admin users, regular users, banned users, etc., without repeating my code.
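A minimal Python sketch of that users example follows. The class and method names (`User`, `AdminUser`, `BannedUser`, `can_post`, `can_delete_others_posts`) are hypothetical; they only illustrate a general class extended by more specific ones.

```python
class User:
    """General user: settings and behavior common to every kind of user."""

    def __init__(self, name):
        self.name = name

    def can_post(self):
        return True

    def can_delete_others_posts(self):
        return False


class AdminUser(User):
    """Reuses everything from User and overrides only what differs."""

    def can_delete_others_posts(self):
        return True


class BannedUser(User):
    def can_post(self):
        return False


users = [User("ann"), AdminUser("bob"), BannedUser("cat")]
for user in users:
    print(user.name, user.can_post(), user.can_delete_others_posts())
```

A fix to `User.__init__` or `User.can_post` is made once and is picked up by every subclass that does not override it.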
Although I have an EE background, I didn't get the chance to attend natural language processing classes.
I would like to build a sentiment analysis tool for the Turkish language. I think it is best to create a Turkish WordNet database rather than translating the text to English and analyzing the buggy translation with existing tools. (Is it?)
So what do you recommend I do? Should I first take NLP classes from an open courseware website? I really don't know where to start. Could you help me, and maybe provide a step-by-step guide? I know this is an academic project, but I am interested in building skills in this area as a hobby.
Thanks in advance.
Here is the process I have used before (making Japanese, Chinese, German and Arabic semantic networks):
Gather at least two English/Turkish dictionaries. They must be independent, not derived from each other. You can use Wikipedia to auto-generate one of your dictionaries. If you need to publish your network, then you may need open source dictionaries, or license fees, or a lawyer.
Use those dictionaries to translate English Wordnet, producing a confidence rating for each synset.
Keep those with strong confidence, and manually approve or fix those with medium or low confidence.
Finish it off manually
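The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two toy dictionaries, the `translate_synset` and `triage` helpers, and the scoring rule (confidence is the fraction of dictionaries that propose a candidate) are all made up here and are not the method from the paper.

```python
from collections import Counter

# Toy English -> Turkish dictionaries (hypothetical, independent sources).
dict_a = {"dog": ["köpek"]}
dict_b = {"dog": ["köpek", "it"]}


def translate_synset(lemmas, dictionaries):
    """Return {candidate: confidence} for one English synset.

    Assumed scoring: each dictionary votes once for every candidate it
    proposes for any lemma; confidence = votes / number of dictionaries.
    """
    votes = Counter()
    for dictionary in dictionaries:
        proposed = set()
        for lemma in lemmas:
            proposed.update(dictionary.get(lemma, []))
        votes.update(proposed)
    return {word: count / len(dictionaries) for word, count in votes.items()}


def triage(scores, keep_at=1.0):
    """Split candidates into auto-kept and needs-manual-review."""
    kept = [w for w, s in scores.items() if s >= keep_at]
    review = [w for w, s in scores.items() if s < keep_at]
    return kept, review


scores = translate_synset(["dog", "domestic_dog"], [dict_a, dict_b])
kept, review = triage(scores)
print(kept, review)
```

Candidates that both dictionaries agree on are kept automatically, and the rest go to the manual pass.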
I expanded on this in the "Automatic Translation Of WordNet" section of my 2008 paper: http://dcook.org/mlsn/about/papers/nlp2008.MLSN_A_Multilingual_Semantic_Network.pdf
(For your stated goal of a Turkish sentiment dictionary, there are other approaches that do not involve a semantic network. E.g. "Sentiment Analysis and Opinion Mining", by Bing Liu, is a good round-up of the research. But a semantic network approach will, IMHO, always give better results in the long run, and it has so many other uses.)
I am new to object-oriented programming and am learning to design classes.
I am wondering how I can design a class which holds a list of itself.
For example, I have a class named Game with the following definition:
Game
title
description
screenshot
flash (holds flash game object)
I want to display a list of games on a page. Which approach is good for that?
Either create another class, GameList, holding an array of the Games to be listed, or
create a function Game.ListAll to display the game list?
I feel the former is the better approach. I need your guidance, please.
Also, I don't know what I should actually study to clarify my understanding of designing classes, their relationships, etc.
Can you please suggest a book or CBT which is easy to understand?
Thank you very much.
-Navi
I don't know what language you are using, but I can recommend "Agile Principles, Patterns, and Practices" by Robert C. Martin (there is a C# version too). It covers a lot about OO design, together with all the important patterns and a nice intro to TDD and Agile, and it's fun to read.
To your question: yes, go for your first idea, or use a built-in collection like List<Game> (it should be the same whether you are using C# or Java).
I think it would be a bad design to have a (static) method listAll() within Game: Games shouldn't know about other games, you would need the constructors or factory methods to take care of the list, and you couldn't handle separate lists.
So use a list outside of Game. If you do not need special behavior from the GameList, a List<Game> field wherever you handle the list will do. If you do need special behavior, your design with GameList is the right choice.
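Here is a minimal sketch of both options, written in Python rather than C# or Java. The `Game` fields follow the question; the string types, the `add` and `titles_sorted` methods, and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Game:
    title: str
    description: str
    screenshot: str  # assumed to be a path or URL
    flash: str       # assumed to be a path or URL to the flash object


@dataclass
class GameList:
    """Only worth having if the list needs behavior of its own."""
    games: List[Game] = field(default_factory=list)

    def add(self, game: Game) -> None:
        self.games.append(game)

    def titles_sorted(self) -> List[str]:
        # Hypothetical "special behavior": titles in alphabetical order.
        return sorted(game.title for game in self.games)


# Option 1: a plain built-in list, with no extra class.
plain_list = [Game("Pong", "Classic", "pong.png", "pong.swf")]

# Option 2: a dedicated GameList with its own behavior.
catalog = GameList()
catalog.add(Game("Tetris", "Blocks", "tetris.png", "tetris.swf"))
catalog.add(Game("Pong", "Classic", "pong.png", "pong.swf"))
print(catalog.titles_sorted())
```

Either way, `Game` itself knows nothing about other games, so you can keep several independent lists.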
For good OO design, I would read up in the following order:
the relevant articles from Object Mentor's article list, to get an overview. You can find links to the ones most relevant for you at Principles of OOD.
a book about design patterns, my favorite still being the classical Design Patterns from the GoF
read Agile Software Development, Principles, Patterns, and Practices from Uncle Bob
I'm currently investigating options for extracting person names, locations, tech words, and categories from text (a lot of articles from the web), which will then be fed into a Lucene/ElasticSearch index. The additional information is then added as metadata and should increase the precision of the search.
E.g. when someone queries 'wicket', he should be able to decide whether he means the cricket sport or the Apache project. I tried to implement this on my own, with minor success so far. Now I have found a lot of tools, but I'm not sure whether they are suited for this task, which of them integrate well with Lucene, or whether the precision of their entity extraction is high enough.
Dbpedia Spotlight, the demo looks very promising
OpenNLP requires training. Which training data to use?
OpenNLP tools
Stanbol
NLTK
balie
UIMA
GATE -> example code
Apache Mahout
Stanford CRF-NER
maui-indexer
Mallet
Illinois Named Entity Tagger Not open source but free
wikipedianer data
My questions:
Does anyone have experience with any of the tools listed above and their precision/recall? Or know whether training data is required and available?
Are there articles or tutorials where I can get started with entity extraction(NER) for each and every tool?
How can they be integrated with Lucene?
Here are some questions related to that subject:
Does an algorithm exist to help detect the "primary topic" of an English sentence?
Named Entity Recognition Libraries for Java
Named entity recognition with Java
The problem you are facing in the 'wicket' example is called entity disambiguation, not entity extraction/recognition (NER). NER can be useful, but only when the categories are specific enough. Most NER systems don't have enough granularity to distinguish between a sport and a software project (both would fall outside the typically recognized types: person, org, location).
For disambiguation, you need a knowledge base against which entities are disambiguated. DBpedia is a typical choice due to its broad coverage. See my answer to How to use DBPedia to extract Tags/Keywords from content?, where I provide more explanation and mention several tools for disambiguation, including:
Zemanta
Maui-indexer
Dbpedia Spotlight
Extractiv (my company)
These tools often use a language-independent API like REST, and I do not know that they directly provide Lucene support, but I hope my answer has been beneficial for the problem you are trying to solve.
You can use OpenNLP to extract names of people, places, and organisations without training. You just use pre-existing models, which can be downloaded from here: http://opennlp.sourceforge.net/models-1.5/
For an example of how to use one of these models, see: http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#tools.namefind
Rosoka is a commercial product that provides a computation of "Salience" which measures the importance of the term or entity to the document. Salience is based on the linguistic usage and not the frequency. Using the salience values you can determine the primary topic of the document as a whole.
The output is in your choice of XML or JSON which makes it very easy to use with Lucene.
It is written in Java.
There is an Amazon Cloud version available at https://aws.amazon.com/marketplace/pp/B00E6FGJZ0. The cost to try it out is $0.99/hour. The Rosoka Cloud version does not have all of the Java API features available to it that the full Rosoka does.
Yes, both versions perform entity and term disambiguation based on linguistic usage.
Disambiguation, whether by a human or by software, requires enough contextual information to determine the difference. The context may be contained within the document, within a corpus constraint, or within the context of the user. The first is the most specific, and the last has the greatest potential for ambiguity. E.g. typing the keyword "wicket" into a Google search could refer to cricket, the Apache software, or the Star Wars Ewok character (i.e. an entity). The sentence "The wicket is guarded by the batsman" has contextual clues within the sentence to interpret it as an object. "Wicket Wystri Warrick was a male Ewok scout" should interpret "Wicket" as the given name of the person entity "Wicket Wystri Warrick". "Welcome to Apache Wicket" has the contextual clues that "Wicket" is part of a place name, etc.
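To illustrate how context within a sentence can drive disambiguation, here is a deliberately naive Python sketch. It is not how Rosoka works; the `SENSES` cue-word table and the `disambiguate` function are hypothetical, and the method is plain cue-word overlap rather than linguistic analysis.

```python
import re

# Hypothetical mini knowledge base: each sense of "wicket" with cue words.
SENSES = {
    "cricket": {"batsman", "bowler", "stumps", "cricket"},
    "Apache Wicket": {"apache", "java", "framework", "web"},
    "Ewok character": {"ewok", "star", "wars", "scout"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when the sentence carries no contextual clue at all.
    """
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("The wicket is guarded by the batsman"))
print(disambiguate("Wicket Wystri Warrick was a male Ewok scout"))
print(disambiguate("Welcome to Apache Wicket"))
print(disambiguate("wicket"))
```

The bare query "wicket" yields no decision, which matches the point above: without context, the ambiguity cannot be resolved.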
Lately I have been fiddling with Stanford CRF-NER. They have released quite a few versions: http://nlp.stanford.edu/software/CRF-NER.shtml
The good thing is you can train your own classifier. You should follow the link which has the guidelines on how to train your own NER. http://nlp.stanford.edu/software/crf-faq.shtml#a
Unfortunately, in my case, the named entities are not efficiently extracted from the document. Most of the entities go undetected.
Just in case you find it useful.