Introduction to data mining

--Originally published at Enro Blog

What is data mining?

In data mining, the data is stored electronically and the search is automated, or at least augmented, by computer.

Data mining is defined as the process of discovering patterns in data. The process must be automatic or (more usually) semiautomatic. The patterns discovered must be meaningful in that they lead to some advantage, usually an economic advantage. The data is invariably present in substantial quantities.

How are the patterns expressed? Useful patterns allow us to make nontrivial predictions on new data. There are two extremes for the expression of a pattern:

  • as a black box whose innards are effectively incomprehensible, and
  • as a transparent box whose construction reveals the structure of the pattern.

We call the transparent-box kind structural patterns because they capture the decision structure in an explicit way.
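As a rough illustration of a transparent-box pattern, here is a minimal sketch of a decision rule written out as plain code. The function name, attributes, and thresholds are invented for this example and do not come from any real dataset; the point is only that every step of the decision can be read directly.

```python
def should_play(outlook: str, humidity: int, windy: bool) -> bool:
    """Hypothetical structural pattern: each branch is an explicit, readable rule."""
    if outlook == "sunny":
        # Illustrative threshold, not learned from data.
        return humidity <= 75
    if outlook == "rainy":
        return not windy
    # Any other outlook (for example "overcast") is treated as a yes.
    return True


decision = should_play("sunny", 70, windy=False)
```

A black-box model could make the same predictions, but it would not expose rules like these for a person to inspect.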

Machine learning

Things learn when they change their behavior in a way that makes them perform better in the future.

Domain knowledge

Market basket analysis is the use of association techniques to find groups of items that tend to occur together in transactions, typically supermarket checkout data.
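A minimal sketch of the idea, assuming a tiny hand-made list of transactions: it counts how often each pair of items appears in the same basket and keeps the pairs whose support (share of baskets containing both items) reaches a threshold. The function name `pair_support`, the sample baskets, and the 0.5 threshold are assumptions made for illustration; real association-rule algorithms such as Apriori go further than this pair counting.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions, min_support=0.5):
    """Return item pairs whose support is at least min_support."""
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical checkout data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

frequent_pairs = pair_support(transactions, min_support=0.5)
for pair, support in sorted(frequent_pairs.items()):
    print(pair, support)
```

Pairs that appear in only one of the four baskets, such as beer and bread, fall below the threshold and are dropped.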

What’s the difference between machine learning and statistics? Cynics, looking wryly at the explosion of commercial interest (and hype) in this area, equate data mining to statistics plus marketing. In truth, you should not look for a dividing line between machine learning and statistics because there is a continuum (and a multidimensional one at that) of data analysis techniques. Some derive from the skills taught in standard statistics courses, and others are more closely associated with the kind of machine learning that has arisen out of computer science. Historically, the two sides have had rather different traditions. If forced to point to a single difference of emphasis, it might be that statistics has been more concerned with testing hypotheses, whereas machine learning has been …

A new machine learning challenge for the upcoming semester.

--Originally published at Enro Blog

As the semester starts, a new challenge has appeared. I was assigned the task of creating software capable of filtering the information in the abstracts of research papers, in order to classify them and build a network of people who are working on more or less the same topics. Universities seem to lack funding for every single researcher, since research has grown a lot in the last couple of decades, so software capable of bringing together professionals with the same interests could potentially reduce research costs. In the upcoming posts I’ll summarize my machine learning studies, findings and understandings.

Let’s do it! – The trial begins

--Originally published at TI2011 – Project Evaluation and Management

This post focuses on what I learned from reading Survival Guide and Project Evaluation and Management (TI2011), and on how I am going to apply all of it.

First of all, next week I am going to be part of a development project with a team of three developers, myself included. The project seems consistent because its requirements and process were defined a month ago.

So, while reviewing the requirements, I have started to make my own plan alongside the sprints and delivery times that were established at the beginning. I am also trying to fix my schedule, design the architecture for every delivery, and see how each delivery will affect the others.

One thing I consider important is to see which risks the project could have and how every development and stage will affect the whole project. Something I like about the plan is that every development includes the time to code it plus some extra time for its test cases.

So, finally I am eager to see what is coming and to make the most of this experience of being part of a project from the beginning.


Pragmatic Projects – Tips

--Originally published at TI2011 – Project Evaluation and Management

Chapter 8 – Pragmatic Projects by Hunt, A. & Thomas, D.

As your project gets under way, we need to move away from individual and coding issues and look at project-sized issues. We are not going to go into the specifics of project management, but we will talk about how to work on the critical areas that can make or break any project.

Establish some ground rules

As soon as you have more than one person working on a project, establish clear ground rules and delegate parts of the project. This is how a team, or a Pragmatic Team, should work.

Automate your procedures

Automation is one of the most important factors in making project activities work consistently and reliably.

Test your code

Most developers hate testing. Finding bugs is like fishing with a net. We use fine, small nets (unit tests) to catch the minnows, and big, coarse nets (integration tests) to catch the killer sharks. So TEST EARLY, TEST OFTEN and TEST AUTOMATICALLY.
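To make "test automatically" concrete, here is a minimal sketch using Python's standard `unittest` module. The function `apply_discount` and its rules are hypothetical and exist only to give the tests something to check; they are not taken from the book.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    """Fine, small nets: each test checks one behavior of one function."""

    def test_regular_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_keeps_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 120)
```

Saved in a file such as `test_discount.py` (a name chosen for this example), the tests can be run on every change with `python -m unittest`, which is what makes them cheap to run early and often.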

Documentation

Documentation is clearly one of the things that developers dislike most. But keep in mind that it gives both you and the user a better picture of the project and its scope.

Sign your work

Developers should rejoice in accepting challenges. If you are responsible for a design or a piece of code, sign your work.


Build the software

--Originally published at TI2011 – Project Evaluation and Management

Chapter Fourteen – Software Project Survival Guide by Steve McConnell

When we are planning a project, we try to visualize how it is going to work, which tools we will need, how much time we need, and the other questions discussed in the previous post. In this post I am going to talk about the construction stage of a project.

Construction is the stage where we create the project and bring it to life. If you made the necessary plans, you will even be able to make your project more “fancy”, perhaps by adding something that is not in the requirements but that you believe will help the user a lot.

So, this chapter gives us a recommended sequence of steps to follow during construction:

  1. Develop a piece of code.
  2. Unit test the code.
  3. Step through the code in an interactive debugger.
  4. Integrate the preliminary code with a private version of the main build.
  5. Submit the code for technical review.
  6. Prepare test cases.
  7. Have the code reviewed.
  8. Fix any problems identified during the review.
  9. Have the fixes reviewed.
  10. Integrate the final code with the main build.
  11. Declare the code “complete.”

These recommendations are very useful because they help us reduce errors and make sure that the code we submit covers all the features that were proposed.

In this stage we should work in order: start by building the skeleton of the project, keep the previous recommendations in mind, document all changes, and test and fix according to the list above.


Detailed Plan

--Originally published at TI2011 – Project Evaluation and Management

Chapter Thirteen – Software Project Survival Guide by Steve McConnell

This chapter is about detailed design, which is important because it can affect the whole project. Detailed design follows a staged approach: at each stage the developers design the milestones that will be delivered at the end of that stage. Having a good architecture helps us focus and work out the plan for the stage we are currently in.

This is also the stage in which developers look for components and code that can be reused from other projects. Sometimes, when you reuse a lot of components, you end up turning them into a library that you can use in any project that needs them.

Every detail is important during this stage because the requirements must be clear. If they are not, they have to be resolved during another stage, and the resulting delay should be reported (letting it get that far is a bad idea). Keep in mind that developers must continue doing detailed design until they reach the milestone for construction.

If several people know each part of the project through the detailed design, then you will have at least a backup plan for continuing the project.

Finally, if you make a perfect detailed design but no one can read it, then you’re doing something wrong. Be sure to use detailed design on your projects; it will help you deliver your project during the development process.

 


Plan inside a Plan?

--Originally published at TI2011 – Project Evaluation and Management

Chapter Twelve – Software Project Survival Guide by Steve McConnell

When we work on a project, there are several stages to complete before the whole plan is accomplished. Each stage is like a small project inside the whole project, which means that in each stage we have to plan, design, and test for that stage's release. The book calls this Staged Planning: at the beginning of every staged delivery cycle, an individual stage plan is created.

The stage plan uses mini milestones, which help us track progress and reduce risks. So the development plan guides the project, while the stage plan guides each stage of the project and is shorter than the main plan.

The stage plan should include every change to the requirements, the designs, the project architecture, the code for that stage, and the test cases. It also includes something important: risk management. At the start of each stage we should keep track of the top priorities for the stage, so that we end the stage with a deliverable product and improve our chances of moving the project forward.

In conclusion, planning is the key to success. Of course there can be changes to the plan and risks along the way, but that is the reason we divide the development plan into smaller stage plans: the changes are split across individual stages, where we can manage and control them.

 


Almost done? – Final preparations

--Originally published at TI2011 – Project Evaluation and Management

Chapter Eleven – Software Project Survival Guide by Steve McConnell

Once we know the important things about how we are going to plan, design, avoid risks, and develop our project, we are ready for final preparations.

The development team is ready to create its first estimates, develop plans for deliveries, and give the project a solid organization.

So, let’s talk about creating meaningful estimates. The team can produce its own estimates, but it should know that an estimate must cover effort, cost, and schedule; should be written down; should assume that the team will not work overtime; should be created with estimation software; and should be based on previous projects. All of this requires a lot of organization, because estimates can change at any time.

Keep risks in mind at every stage and in every estimate of the project. Write a Staged Delivery Plan, so that you deliver in stages, each with working functionality.

It is important to remember that anything (risks, changes, bugs) can happen on a project. Project estimation is therefore necessary: it is the way to know what you will do and deliver, and to check whether the estimate is reachable before saying yes or no to the development.

 

 


Architecture

--Originally published at TI2011 – Project Evaluation and Management

Chapter 10 – Software Project Survival Guide by Steve McConnell

When we work on a project, we always try to design what it is going to do, which scheme or design it must use, and how it is structured. That last part is called software architecture, and it provides the technical structure for a software project. If it is good, it will make the rest of the project easy, because it provides program organization, ways in which the architecture supports changes, components that can be reused, and design approaches.

The design of a software architecture is a phase during the software development in which it is mapped out using prototypes and design diagrams (UML, sequence diagrams, etc).

So, what are the characteristics of a good architecture?

  • System Overview
    • Describe the system in broad terms, so that a coherent picture can be built from the details of its classes or modules.
  • Conceptual integrity
    • Objectives must be stated clearly. Design with a primary objective or goal.
  • Subsystems and organization
    • Define the subsystems in a program, each covering a functional area such as data storage, analysis, user input, and so on. Draw a scheme or diagram of how the subsystems make up the complete system.
  • Notation
    • For large projects, adopt a standard notation such as UML. For smaller ones, make sure that everyone understands what the diagrams mean.
  • Change scenarios and change strategy
    • Identify program areas that are most likely to change. Know your project scope and decide how much damage each change is going to do.
  • Reuse analysis and BUY vs BUILD decisions
    • Decide which components will be purchased commercially, which will be reused from internal code and which will be created from scratch.
  • Approaches to functional areas