Thursday, December 8, 2011

Talend an ETL Tool


ETL is an abbreviation of the three words Extract, Transform and Load. It is an ETL process to extract data, mostly from different types of systems, transform it into a structure that’s more appropriate for reporting and analysis and finally load it into the database and or cube(s).
ETL – EXTRACT FROM SOURCE
ETL – TRANSFORM THE DATA


Real-life ETL cycle
Build reference data
Extract (from sources)
Validate
Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
Stage (load into staging tables, if used)
Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair)
Publish (to target tables)
Archive


Extract
The purpose of the extraction process is to reach to the source systems and collect the data needed for the data warehouse.
Usually data is consolidated from different source systems that may use a different data organization or format so the extraction must convert the data into a format suitable for transformation processing.
The complexity of the extraction process may vary and it depends on the type of source data.
The extraction process also includes selection of the data as the source usually contains redundant data or data of little interest. 

For the ETL extraction to be successful, it requires an understanding of the data layout. A good ETL tool additionally enables a storage of an intermediate version of data being extracted. This is called "staging area" and makes reloading raw data possible in case of further loading problem, without re-extraction. The raw data should also be backed up and archived. 

Transform
The transform stage applies a series of rules or functions to the extracted data from the source to derive the data for loading into the end target. Some data sources will require very little or even no manipulation of data. In other cases, one or more of the following transformation types may be required to meet the business and technical needs of the target database:
Selecting only certain columns to load (or selecting null columns not to load). For example, if the source data has three columns (also called attributes), for example roll_no, age, and salary, then the extraction may take only roll_no and salary. Similarly, the extraction mechanism may ignore all those records where salary is not present (salary = null).
Translating coded values (e.g., if the source system stores 1 for male and 2 for female, but the warehouse stores M for male and F for female)
Encoding free-form values (e.g., mapping "Male" to "1")
Deriving a new calculated value (e.g., sale_amount = qty * unit_price)
Sorting
Joining data from multiple sources (e.g., lookup, merge) and deduplicating the data
Aggregation (for example, rollup — summarizing multiple rows of data — total sales for each store, and for each region, etc.)
Generating surrogate-key values
Transposing or pivoting (turning multiple columns into multiple rows or vice versa)
Splitting a column into multiple columns (e.g., putting a comma-separated list specified as a string in one column as individual values in different columns)
Disaggregation of repeating columns into a separate detail table (e.g., moving a series of addresses in one record into single addresses in a set of records in a linked address table)
Lookup and validate the relevant data from tables or referential files for slowly changing dimensions.
Applying any form of simple or complex data validation. If validation fails, it may result in a full, partial or no rejection of the data, and thus none, some or all the data is handed over to the next step, depending on the rule design and exception handling. Many of the above transformations may result in exceptions, for example, when a code translation parses an unknown code in the extracted data.
The transform stage of an ETL process involves an application of a series of rules or functions to the extracted data. It includes validation of records and their rejection if they are not acceptable as well as integration part. The amount of manipulation needed for transformation process depends on the data. Good data sources will require little transformation, whereas others may require one or more transformation techniques to to meet the business and technical requirements of the target database or the data warehouse. The most common processes used for transformation are conversion, clearing the duplicates, standardizing, filtering, sorting, translating and looking up or verifying if the data sources are inconsistent. A good ETL tool must enable building up of complex processes and extending a tool library so custom user's functions can be added.

Load
The loading is the last stage of ETL process and it loads extracted and transformed data into a target repository. There are various ways in which ETL load the data. Some of them physically insert each record as a new row into the table of the target warehouse involving SQL insert statement build-in, whereas others link the extraction, transformation, and loading processes for each record from the source. The loading part is usually a bottleneck of the whole process. To increase efficiency with larger volumes of data we may need to skip SQL and data recovery or apply external high-performance sort that additionally improves performance.
Talend Profile
Its GUI is made of a metadata repository and a graphical designer. The metadata repository contains the definitions and configuration for each job - but not the actual data being transformed or moved.
The information in the metadata repository is used by all of the components of Talend Open Studio.
Talend can be used
Synchronization or replication of databases
Right-time or batch exchanges of data
ETL (Extract Transform Load) for analytics
Data migration
Complex data transformation and loading
Data quality
Best Part
Talend Open Studio operates as a code generator allowing data transformation scripts and underlying programs to be generated either in Java (since v2.0) or Perl (since v1.0).
Of course we can customize wherever needed.

Tuesday, September 27, 2011

Salesforce Winter 12 Release


Salesforce's Winter 12 Features Announced

After Dreamforce 11, salesforce has just published the Winter 12 release notes. There are many new features in this release, and some influential ones. I am posting few of them
  • Analytics Enhancements - Coming out of Dreamforce (more on my Dreamforce thoughts in another post), this is what I was most excited about. Starting with this release, there is a terrific roadmap of new features for reports and dashboards, which is going to be a big win for customers. Just to name a few of the new ones in Winter 12:

    • Dashboard Filters - Tired of creating a clone for a dashboard just to get a cut on a different metric, like Region or Industry? Dashboard filters allows you to add a filter to the dashboard, that when changed, applies to the filter to all the dashboard components. A small, but terrific feature. For now, it can only be a picklist, lookup or text field. (let's hope date fields are close behind)
    • New Reports Tab - A slick re-design of the Reports tab. The image shows it nicely, but essentially it combines Reports and Dashboards into one place to find and run them. An enhanced search has also been added for those customers with lots of reports or dashboards. A few other additions helps admins manage the folders and creation a bit better, so all around a good addition.
    • Visualforce Charting - This is a pilot, but salesforce showed it at Dreamforce and it looked very slick. Essentially, visualforce reports are much easier to build as there are now pre-built components for reports and charts which means you won't need to start from scratch building them.
    • Reports are now in the Recent Items list

  • Chatter Enhancements - As expected, there's a few new Chatter enhancements, but I almost see this as the next level of Chatter. It's no longer just a feed.

    • Chatter Messages - Messaging and Presence now built directly into salesforce. This is a big collaboration enhancement, as now users can see if someone is logged into salesforce, and send them a private message. Just another step to allowing the team to remain in salesforce to get all the information they need.
    • Customers in Private Chatter Groups - Your Customers (or Partners, which I think is even bigger), can now be invited into private and secure Chatter groups so they can collaborate with your salesforce users. Share files, posts, messages, etc. just like you would with a normal salesforce user. Best of all, it's free.
    • Chatter Approvals - Approvals can now be managed through the Chatter feed
  • SiteForce - Salesforce showed this off at Dreamforce 10, and it's coming live in Winter 12. Too many details to cover here, but essentially, SiteForce is a CMS editor to create pages for Sites easier. Currently, if you want to really change your portal look and feel, you'd need to create visualforce pages. With this, you get much more flexibility in what you can customize, and from the look of it, the ability to do it a lot easier than visualforce.
  • Social Contacts - Directly from the Contact record, you can now view a Contact's Twitter, Facebook, or Linkedin feed. This is the next step in Salesforce's push to "socialize" the enterprise and I can see where Radian6 will step in soon to enhance this even further. For now, this is simply a way to view the feeds, as long as you know your contact's user ID for these services. Speaking of which...
  • Data.com - Jigsaw is now called data.com. First, one of the big announcements at Dreamforce was that Data.com now contains a subset of D&B information. This is big. Previously, Jigsaw was great for contact data, but pretty poor around Account information. D&B was the exact opposite. Now, with one service you can get the best of both, completely integrated with salesforce and with the ability to make it even more powerful with API access. This is becoming a must have if you do serious lead nurturing or outbound calling. In addition to this announcement, there are a few new enhancements within salesforce, including the ability for users to save searches and improved exporting. Finally, back to the Social Contacts. With salesforce.com's focus on the "social enterprise", it is clear that Data.com will be making a strong push to start capturing company's and contact's social accounts as well as their email and phone.
  • Forecasts - A revamp of the Forecasts feature. A new, impressive looking UI, and some nice improvements to the usability. From the notes, it's hard to tell if it is more flexible though, which has always been the biggest issue.
  • Salesforce for Outlook now supports Outlook 2010, including 64 bit. Still no ability to create Cases from an email though.
  • CTI 4.0 - Of course, this doesn't mean the phone vendors will support it just yet, so may be a lag before you can take advantage of this. Biggest part of this is it now supports Chrome as well the Mac.
Winter 12 is a big release with great enhancements and a lot of smaller enhancements to really fill it out. Please comment if you find out more.

Monday, September 26, 2011

Salesforce CRM Overview


Software as a service (SaaS), sometimes referred to as "on-demand software," is a software delivery model in which software and its associated data are hosted centrally on a service offering compnies premise(called as on "cloud") and are typically accessed by users using a thin client, normally using a web browser over the Internet.
Force.com is a platform provided by salesforce which is totaly on cloud. It is generally known as PaaS (Platform as a Service). Customer need not to maintain any Application or database server simply buy licenses to use services on per user basis.

Salesforce application operate on Multitenant Environment. A multitenant application cost-efficiently shares 
a single stack of resources to satisfy the needs of multiple tenant (organization).

Single-tenant systems give a user its own database and its own instance of the software application. Placed on either  individual server, or segregated via extensive security controls to create its own virtual server, users of single-tenant systems enjoy the benefits of significant configurability of software, robust functionality, and enhanced security.

i think below screen shot make more sense.
Single Tenant Architecture

Multitenant Architecture



Salesforce Admin Exam                               Identity Confirmation  





Field datatype mapping between Oracle/SQL Server and Salesforce

Over the time I have started developing a tool that pulls salesforce metadata information and creates oracle/sql server “create table” scripts. Isn’t that really awesome!!
The most useful part of the tool is field data type mapping i.e, what data type of salesforce maps to what data type of oracle/sql server. Below is the list of mapping that I used while generating create table scripts – enjoy!!

salesforce data typesql server data typeoracle data type
booleanbitvarchar2(1)
datesmalldatetimedate
datetimeDatetimedate
currencydecimal(precision,scale)number(precision,scale)
doubledecimal(precision,scale)number(precision,scale)
intIntnumber(10)
picklistnvarchar(255)varchar2(255)
idnvarchar(18)varchar2(18)
referencenvarchar(18)varchar2(18)
textareanvarchar(max)varchar2(4000)
emailnvarchar(255)varchar2(255)
phonenvarchar(255)varchar2(255)
urlnvarchar(255)varchar2(255)
textareanvarchar(max)varchar2(4000)
multipicklistnvarchar(max)varchar2(4000)
anyTypenvarchar(max)varchar2(4000)
percentdecimal(5,2)number(5,2)
comboboxnvarchar(max)varchar2(4000)
base64nvarchar(max)varchar2(4000)
timenvarchar(255)varchar2(255)
stringnvarchar(length)varchar2(length)

Sunday, September 25, 2011

Salesforce CRM Overview

In this post i am going to cover all about Salesforce CRM overview, however the topic is vast enough and can take 100 pages of blogs but we will stick to the Admin Exam's perspective.

Wednesday, August 10, 2011

All About Sales Force Admin Exam

The distribution of Admin exam question based on topic as given by SalesForce can be seen below.This post will be a series that will cover all topic upto the good descriptive level and i can assure you this series will get through Admin exam easily.

Objective
Weight
Salesforce CRM Overview
3%
Describe the software-as-a-service (SaaS) model

List and describe the different Salesforce applications

List and describe the standard objects

Identity Confirmation (setting login restrictions)
2%
Describe the Identity Confirmation feature

Describe the differences between logging in through the API versus the UI

Explain the concept of Login Hours and Login IP ranges

Add and delete an IP range

Describe the methods to allow access to the application

Navigation – User Interface (UI) Settings
3%
Identify specific areas of the Salesforce Home Tab

Configure Search settings for your organization

Customize the Tabs a user can access or select from Tab settings

Locate, identify, and explain the UI features an administrator controls

Organization Administration
5%
List what information is included in the Company Profile

List the places where Currency is specified

Describe the effect of changing your company’s default Time Zone, Currency, Locale, and Language

Describe the effect of enabling the Custom Fiscal Year setting

Create and manage letterheads and email templates

Customization
17%
Describe a Profile

Explain what a Profile controls

List the Standard Profiles

List the features available for managing profiles

Evaluate when to create a Custom Profile

Explain the differences between a Custom Field and a Standard Field

List the different types of Custom Fields

Set up a Custom Field

Map Custom Lead Fields

Re-label a Standard Object

List objects for which you can enable Fields History Tracking

Define the dependencies when using a Dependent Picklist

Create and modify a Dependent Picklist

Describe the capabilities of Custom Lookup fields

Create and modify a Lookup Field

Describe the capabilities of Formula Fields

Create and modify a Formula Field

Describe Roll-up Summary Fields

Create and modify a Roll-up Summary Field

Describe the elements within a Page Layout

Create and modify a Page Layout

Create, customize, and print a List view

Customize a Related List

Describe how to use Record Types

Create and modify a Record Type

List the objects that require a business process

Explain how Field-Level Security affects page layouts and visibility

Set up Field-Level Security

Security & Access
13%
List the information captured on the User Record

Create and maintain User Records

Explain the Record Owner concept

Describe the elements of the Sharing model

Describe the scope and capabilities of Organization-Wide Defaults

Explain how access is granted through the Role Hierarchy

Set up Organization-Wide Defaults

Describe the use of Roles

Build a Role Hierarchy

Assign users to Roles

Mass-transfer records from one user to another

List the objects that may have Sharing Rules

Build Sharing Rules

Share records manually

Describe the use cases of Public Groups and where to use them

Compare and contrast Sales with Account teams

List the places to use Folders

Describe how Folder access differs from Record access

Create Folders to organize and provide access to data

Workflow
5%
Explain how Salesforce implements Workflow and when to use Salesforce Workflow

List and describe use cases for the four Workflow Actions

Explain the capabilities and settings of a Workflow Rule

Describe the differences between immediate and time-dependent Workflow actions

Set up a Workflow Rule and associated actions

Workflow Approval Processing
3%
Define the basics of the Workflow Approval Process

Compare and contrast the Approval Processes versus Workflow

List the differences between the Jump Start Wizard and the Standard Approval Process Wizard

Explain how to route Approvals

Create, modify, and test the Approval Process

Identify the settings in creating Workflow Approval Processing

Data Validation
3%
Define Data Validation

List common use cases for Data Validation

Describe when Data Validation Rules are enforced

Create Data Validation Rules

Data Utilities
5%
Describe the function of an External ID

List and describe the different tools and use cases for data migration

Update existing data via import

Define when to use the Data Loader

List the functions of the Data Loader

Mass Update Records

Mass Delete Records

Describe the capabilities of the Recycle Bin

Request and download a weekly export

Explain how Salesforce allocates storage

Analytics
13%
Describe the administrative restrictions on Standard Reports

Run a Standard Report

Create a Custom Report

Search for Custom Reports and Dashboards

Explain the differences among report types

Explain Custom Report Types

Create charts to display Report results

Use Advanced Filter criteria to narrow Report results

Create Custom Summary Formulas for Summary Reports

Choose settings for Conditional Highlighting

Describe the capabilities and purpose of a Dashboard

List Dashboard component types

Create a Dashboard and add components to it

Print and export Reports

Email and schedule Reports and Dashboard refresh

Define Running User and how it affects the Dashboard results

Marketing Administration
5%
Define a Campaign

Describe when to use Campaign Hierarchies

Create a Campaign targeting both existing customers and new Leads

List the features used to create an Automated Campaign

List the methods to associate Leads and Contacts to a Campaign

Describe how to use Lead Queues to manage Leads

Describe how to use a Lead Assignment Rule to assign and route Leads

Describe the use cases for Web-to-Lead

Describe the functionality of a Lead Auto-Response Rule

Build a Lead Queue, Assignment Rule, Web-to-Lead, and corresponding Auto-Response Rule

Measure Campaign results

Describe the process and effects of converting a Lead

Service & Support Administration
5%
Define a Case

Describe use cases for Case Hierarchies

Describe how to use Case Queues to manage Cases

Describe how to use a Case Assignment Rule to assign and route Cases

Describe use cases for Web-to-Case

Describe uses cases for Email-to-Case

Describe the functionality of the Case Escalation Rule

Describe the functionality of the Case Auto-Response Rule

Describe the functionality of the Business Hours

Configure the settings and rules for Case processing, including Business Hours, Escalation Rules, and the use of Email Templates

Define Solutions and how they can be associated with Cases

Build a Solution Category tree and assign Solutions to Categories

Define suggested solutions and explain where to use them

Describe what content is accessible via the Self-Service Portal

Describe the ways to access the Self-Service Portal

Build a Case Queue, Assignment Rule, Web-to-Case, and corresponding Auto-Response Rule

Describe use cases for Case Teams

Salesforce Console
2%
Describe the functionality of the Salesforce Console

Describe the components of the Console Tab

Set up the Salesforce Console

Navigate within the Salesforce Console

Custom Objects, Applications, & Tabs
7%
Describe a Custom Object

Describe the types of relationships that can exist between objects

Create a Custom Object

Describe a Custom Object Tab and a Web Tab

Create a Custom Object Tab

Create a Web Tab

Describe a Custom App

Create a Custom App

The AppExchange
2%
Describe the AppExchange

Describe the characteristics of Managed Packages

Install a package

Uninstall a package

Activities
2%
Compare and contrast Tasks and Events

Manage Public Calendars and Resources

Identify and describe the Activity Settings

Describe the capabilities of Multiday Events

Advanced Configuration Options
5%
Describe the Client Management feature

Describe the Territory Management feature

Describe Advanced Currency Management


Our study will include 18 complete post as you can see above 18 blue heading.
so here is first tutorial in the Admin Series.