Talking about Search Algorithm & Interface

Dear Readers,

We worked hard on this subject, Search Algorithm & Search Interface, for both Vanilla and AklaBox, and the results are promising. The questions were: “Why this list of results when I type a keyword?” and also “Why a simple list?”

“Blablabla” … why should the word count be a key factor in moving a document to the top of the list? Repeating a word doesn’t always make the document more accurate, except in brainwashing mode, right?

Why should we use a rule like “the most viewed documents go to the top of the list”? Why should the list show the latest published documents first? Should we promote a geographic zone or certain authors because they are famous? I read that Google was promoting websites that are “mobile ready”… does that make a document more accurate? We could add many other rules for displaying the results … and we can also restart from scratch: when I’m looking for a keyword, does a document need to contain this keyword to be listed? … just rethink this …
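To make these questions concrete, here is a minimal sketch (in Python, purely for illustration) of how the choice of rule changes the order of the list. The `Doc` fields, the weight names and the sample documents are all invented for this example; this is not the algorithm used in Vanilla or AklaBox.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    title: str
    text: str
    views: int
    published: date


def score(doc, keyword, weights, today):
    """Weighted sum of three illustrative signals (all hypothetical)."""
    term_count = doc.text.lower().split().count(keyword.lower())
    age_days = (today - doc.published).days
    recency = 1.0 / (1.0 + age_days)  # newer documents get a value closer to 1
    return (weights["term_count"] * term_count
            + weights["views"] * doc.views
            + weights["recency"] * recency)


def rank(docs, keyword, weights, today):
    """Return the documents sorted from highest to lowest score."""
    return sorted(docs, key=lambda d: score(d, keyword, weights, today), reverse=True)


# Invented sample data
DOCS = [
    Doc("Spam", "cloud cloud cloud cloud cloud", views=3, published=date(2014, 1, 1)),
    Doc("Guide", "a practical guide to cloud storage", views=500, published=date(2014, 6, 1)),
    Doc("News", "cloud pricing update", views=40, published=date(2015, 4, 1)),
]
TODAY = date(2015, 4, 2)

by_count = rank(DOCS, "cloud", {"term_count": 1.0, "views": 0.0, "recency": 0.0}, TODAY)
by_views = rank(DOCS, "cloud", {"term_count": 0.0, "views": 1.0, "recency": 0.0}, TODAY)
by_recency = rank(DOCS, "cloud", {"term_count": 0.0, "views": 0.0, "recency": 1.0}, TODAY)

for name, result in [("count", by_count), ("views", by_views), ("recency", by_recency)]:
    print(name, [d.title for d in result])
```

Same keyword, same documents, three different “first results”: the repetitive document wins on word count, the popular one on views, the newest one on recency. None of the three rules says anything about which document is the most accurate.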

Why does the user profile matter when I search? What is a Skyline: a data mining subject or an airplane situation? What is the cloud, for an IT person or for a dreamer? … Why should my personal past searches influence my next results?
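As a rough illustration of what “the profile matters” could mean, the sketch below re-orders the same two results depending on the interests attached to a user. The topic labels, the `rerank_for_profile` helper and the sample results are hypothetical, chosen only to mirror the Skyline example.

```python
def rerank_for_profile(results, interests):
    """Move results whose topic matches the user's interests to the front.

    results   -- list of (title, topic) tuples, in their original order
    interests -- set of topics attached to the user profile (hypothetical)
    sorted() is stable, so the original order is kept inside each group.
    """
    return sorted(results, key=lambda r: r[1] not in interests)


# Invented results for the keyword "skyline"
RESULTS = [
    ("Skyline queries explained", "data mining"),
    ("Skyline: view from the cockpit", "aviation"),
]

for_analyst = rerank_for_profile(RESULTS, {"data mining"})
for_pilot = rerank_for_profile(RESULTS, {"aviation"})

print(for_analyst[0][0])
print(for_pilot[0][0])
```

The keyword is identical in both calls; only the profile differs, and so does the first result.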

Because we are talking about the Search Engine (algorithm) and the Search Interface, working on our set of documents (either reports / external documents in Vanilla, or documents / websites in AklaBox), the answers to those questions have no limits but the user interface.

For the Search Algorithm, we moved it out into a dedicated component: easy to embed, easy to modify, easy to customize. We believe some customers need specific algorithms to run their searches, and they want to know and control those algorithms (that’s another subject: machine learning with search algorithms). Thanks to “R”, this subject is under control, with awesome results.
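The idea of a dedicated, swappable algorithm can be sketched as follows. This is only an illustration in Python: the actual component relies on “R”, and the `SearchService` class and both ranking functions below are hypothetical names invented for the example.

```python
from typing import Callable, List

# A ranker takes a query and a list of documents, and returns the ordered matches.
Ranker = Callable[[str, List[str]], List[str]]


def _matches(query: str, documents: List[str]) -> List[str]:
    """Keep only the documents that contain the query as a whole word."""
    q = query.lower()
    return [d for d in documents if q in d.lower().split()]


def keyword_count_ranker(query: str, documents: List[str]) -> List[str]:
    """Documents that repeat the keyword most often come first."""
    q = query.lower()
    return sorted(_matches(query, documents),
                  key=lambda d: d.lower().split().count(q), reverse=True)


def shortest_first_ranker(query: str, documents: List[str]) -> List[str]:
    """Shortest matching documents come first."""
    return sorted(_matches(query, documents), key=len)


class SearchService:
    """The host application only knows the Ranker signature, not the algorithm."""

    def __init__(self, ranker: Ranker) -> None:
        self._ranker = ranker

    def set_ranker(self, ranker: Ranker) -> None:
        # Swap the algorithm without touching the rest of the application.
        self._ranker = ranker

    def search(self, query: str, documents: List[str]) -> List[str]:
        return self._ranker(query, documents)


DOCUMENTS = ["cloud cloud cloud backup notes", "cloud guide", "skyline query"]

service = SearchService(keyword_count_ranker)
first = service.search("cloud", DOCUMENTS)

service.set_ranker(shortest_first_ranker)
second = service.search("cloud", DOCUMENTS)

print(first)
print(second)
```

Because the service depends only on the function signature, a customer-specific algorithm can be plugged in, inspected and replaced on its own.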

In terms of Search Interface, here are some examples of what is available: decision trees, maps, clusters …

AklaBox GeoSearch

AklaBox Mind Map Result

We are talking about information, Knowledge Management … but what do we do with this information? Do we use it to make the right decisions? Do we always replicate the same behavior, to get the same results?

… it’s a morning post about Search … so it made sense to have questions …

Have fun!

Patrick

Vanilla 5 – Technology update

Dear Readers,

A common question about the Vanilla v5 technology update: which components are new, and which components have been updated? I guess this is a very sensitive question when moving to a new major version of the platform.

This post is an early attempt to discuss some of those components. The list is not limited to what is listed below, and some components will still get a “last minute update” (like the “R” package, just updated to version 3.2).

Java Corner

For the development studio, we moved to Eclipse 4.3 and kept this version (we have not yet moved to Luna, Eclipse 4.4). This comes with some funny problems, like having to run xulrunner for VanillaDashboard, but overall this Eclipse upgrade was smoother than the previous one.

Java7

Java support is still Java 7. The move to Java 8 is not yet validated, because many of our components are not yet certified for Java 8.

On the web framework side, we moved to GWT 2.5.x (we have not yet moved to GWT 2.6; version 2.7 will be the next version we evaluate when it’s time to migrate this framework).

Vanilla Portal is a certified HTML5 portal, with support for the 5 major browsers: Internet Explorer, Chrome, Firefox, Opera and Safari.

Server Side

We migrated to the latest Tomcat 7 version, and we also provide a fully functional standalone version of the Vanilla server with an embedded H2 database.

On the Hadoop side, the Vanilla 5 platform has been certified with the latest Cloudera (C5) and Hortonworks (Data Platform 2.1) Hadoop platforms.

Vanilla certified Platforms

Components & Libraries

We have upgraded to the latest version of FusionCharts XT (trial version), to enjoy the latest HTML5 chart display (again, the FusionCharts team is doing a great job!!!), and we also migrated to Hibernate 4.2.8 (clearly, this post is not the right place for the complete list of updated jars and packages).

OSM support is now integrated in all of our studios, providing support and a renderer for OpenStreetMap visualization.

Additional server components have also been upgraded:

Solr/Lucene is now version 5

ApacheSolr

The R package will soon be version 3.2 (it will be available this week, together with Vanilla 5 rc9, as an upgrade from version 3.1.2)

R

… maintaining and upgrading a platform is a subject in itself, and we are not really the ones setting the rules: why Java 8, why Windows 10 and its new browser (so IE will be discontinued), why a new standard (HTML5) took the place of a former one … Vanilla v5 will be an LTS version, so we have to be ready to provide support on updated components!

Have fun!

Patrick