France to force programming teaching starting at age 10

Looks to me like France.. Of ALL COUNTRIES.. Is forcing kids to learn how to program.  This is fascinating. I was about 10 when I started programming.  I just wish that my programming education continued through to today.

I learned on BASIC. I love BASIC.  Realistically.. I am sure these kids will be learning Python.. Not BASIC.
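For the curious, here is the kind of first program a 10-year-old might write in Python, the spiritual equivalent of the BASIC loops I started on. (Purely illustrative; I have no idea what the French curriculum will actually cover.)

```python
# A first program of the sort a 10-year-old might write: the Python
# equivalent of the BASIC FOR...NEXT loops many of us started with.

def times_table(n, upto=10):
    """Return the times table for n as a list of strings."""
    return [f"{n} x {i} = {n * i}" for i in range(1, upto + 1)]

if __name__ == "__main__":
    for line in times_table(7):
        print(line)
```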

I wish that there were more programming offerings everywhere. Not even Tacoma Community College has any REAL options for me to expand my programming knowledge.

I firmly believe that EVERY SINGLE PERSON IN THE WORLD should know how to program.  Maybe I am egocentric in this belief. I don’t care.

Most people look down on programming as something nerds do. Maybe mandatory conscription of everyone.. Would help to get MOST PEOPLE better in tune with the needs of their employers.


Not so fast, NoSQL — SQL still reigns

Glad to see MSSQL is still the most popular.

I think that for most projects..  SQL Server Express is the best choice.. MySQL just can’t compete with the indexing options available in free MSSQL.

When it comes to which database technologies are currently in use at the participating companies, the results featured established commercial products like Microsoft SQL Server (57 percent) and Oracle (38 percent), as well as open source options like MySQL database (40 percent) and PostgreSQL (13 percent).
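To illustrate what I mean about indexing options: SQL Server Express supports filtered indexes, an index over only the rows matching a WHERE clause. A live SQL Server isn't assumed here, so this sketch uses SQLite's partial indexes (the closest portable stand-in) purely to show the idea; the T-SQL syntax is nearly identical.

```python
import sqlite3

# Sketch of a filtered/partial index. SQLite stands in for SQL Server
# Express here so the example runs anywhere; table and index names are
# invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 100, "open" if i % 10 == 0 else "closed") for i in range(1000)],
)

# Index only the ~10% of rows that are open: smaller, and cheaper to maintain.
conn.execute(
    "CREATE INDEX ix_orders_open ON orders (customer_id) WHERE status = 'open'"
)

# The planner uses the partial index when the query implies its WHERE clause.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE customer_id = 7 AND status = 'open'"
).fetchall()
print(plan)
```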

Small Business Email Provider – need recommendations

So I have decided to change my domain.. My email address. I am trying to increase the value of my brand.

I think I have decided on TWO URLs. I will share them soon.

One half of my career.. To focus on databases, mostly SQL Server, MySQL and Hadoop. 

The other half of my career.. Has primarily been a hobby of mine.  I love WordPress. I just am fascinated by the productivity gains that are realized by standardizing on the world’s most popular platform.

I will be sharing these domain names.. And moving my blogs soon.  I am excited to peer into the future and plan for this.

The ONE major sticking point in my plan. EMAIL.

I just can’t stand paying for email service.. But I REALLY need to get some productivity solution that includes a calendar.  Any ideas of where to turn? I need a custom domain name. And calendar. And FREE.

MOST importantly. I need to support multiple accounts.  I want to segregate my sales leads from existing clients’ communications. I want to be able to close one inbox.. And get 8 hours of work done. Without hearing from people about new opportunities.

Currently, I have something like a hundred thousand emails in gmail.  This makes it hard to find things.

If you have any advice on where to look for custom domains.. Hosted email, multiple accounts.. And zero cost.. Please let me know. Currently the best option I can see.. Is GoDaddy Virtual Private Servers. They have not one.. But THREE webmail providers available automatically with each of my websites. I didn’t have to lift a finger.  I am just apprehensive about how this will work.. Once I start doing more broadcast emails..

I think that Broadcast Emails are a critical option for MOST companies.. But most people are too scared to do it. I love getting automatic email reminders. And scheduled emails from clients and systems.
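To show how little machinery a broadcast reminder actually needs, here is a sketch using Python’s standard library. The addresses and SMTP host are made up, and the actual send is left commented out:

```python
from email.message import EmailMessage

# Sketch of a broadcast reminder. All names and addresses are invented;
# the smtplib send is shown but not executed here.
def build_broadcast(sender, recipients, subject, body):
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sender                    # broadcast: recipients go on Bcc
    msg["Bcc"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

msg = build_broadcast(
    "me@example.com",
    ["a@example.com", "b@example.com"],
    "Reminder: invoices due Friday",
    "Automated reminder -- no reply needed.",
)

# To actually send (not executed here):
# import smtplib
# with smtplib.SMTP("smtp.example.com", 587) as s:
#     s.starttls()
#     s.send_message(msg)  # uses Bcc for routing but does not transmit the header
```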

I just can’t BELIEVE that Google and Outlook / Microsoft don’t have lower priced email packages for businesses. I think it is called collusion.. When two giants copy each other’s actions… And stifle innovation.

What other options do I have? Yahoo?

ROFL.  As if.

Google Voice – Multiple Numbers – anyone use this ?

Hey does anyone I know use Google Voice in a business environment.. In order to screen calls? If so, I would love to talk (Seattle area code) 934 9333.

I think I want two Google voice numbers.. One for personal and one for business.. And be able to screen calls from new clients.. So that I don’t get interrupted during the workday.

It seems like such a nice feature. But from this article.. It sounds like there are MAJOR problems in separating traffic between the two numbers.

I guess I need to have a bunch of different Google Accounts.. Gmail accounts. It’s just such a shame that Google had to kill their free products.. In order to CON people into using their Office Software.  Seems completely worthy of antitrust investigations.

They got people hooked onto Gmail (with a custom domain) and then bundled that with Google Documents.

I have always loved MS Office.. And it’s a major shame that Google AND Microsoft killed their free offerings… So that they could charge people fifteen bucks a month for email service.

It blows my mind. That nobody can offer decent email with custom domains.. For a dollar per user per month.

I honestly. Sincerely. Am going to start relying on my GoDaddy Virtual Private Server to host my own email services.  Seems ridiculous to me. But it is the only path forward.

VersionPress is VERY important. And NOT getting enough funding

If anyone wants to contribute to a VERY good open source project.. I really think that VersionPress would be a great candidate.  I love WordPress. It makes web development fun again.. Every day I find a plugin that seems like a FANTASTIC feature.

The one weak point in most WordPress projects has got to be version control.  There is nothing more frustrating than a poorly written plugin or theme.

I really think that WordPress NEEDS automatic version control.. Sure would love to have this feature.

I hope that they can come up with the funding for this project in time.

Why doesn't Facebook allow me to save DRAFTS before I post them?

I write a lot. Sorry. It happens.   I am frequently more verbose than necessary.

I just for the life of me.. Can’t believe that FACEBOOK doesn’t have the decency.. To allow me to save a DRAFT..  When I am halfway done with a thought.. I *SHOULD* be able to just ‘SAVE AS DRAFT’.

Come on Facebook.. What year is this?

Is it REALLY too complex to offer drafts?
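No, it isn’t. Here is the entire feature, sketched in a few lines of Python with a local file as the backing store (a real site would use a per-user database table, but the mechanics are the same):

```python
import json
import tempfile
import time
from pathlib import Path

# "Save as draft" is mechanically tiny: serialize the half-finished text,
# restore it on demand. File-backed here purely for illustration.
class DraftStore:
    def __init__(self, path):
        self.path = Path(path)

    def save(self, text):
        self.path.write_text(json.dumps({"text": text, "saved_at": time.time()}))

    def load(self):
        if not self.path.exists():
            return None
        return json.loads(self.path.read_text())["text"]

store = DraftStore(Path(tempfile.gettempdir()) / "fb_draft_demo.json")
store.save("Halfway through a thought...")
print(store.load())
```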



Microsoft Paying Bloggers To Write About Internet Explorer | Uncrunched

I think it is so sad that people give Microsoft such a hard time.

I honestly love Internet Explorer. I don’t see anyone building easy Chrome automation tools using ActiveX controls… The ability to drive IE via Visual Basic means I have always cared less about other browsers.

I just can’t fathom that a company the size of Microsoft has a short-man’s complex…

I think that Internet Explorer is easily the best browser.

With Chrome and Firefox.. I am always having issues.. For example on one machine, I install a plugin.. That LOOKS credible.. But the next thing I know my browser’s homepage is set to Conduit.  Then these other browsers have the audacity to push that setting to all of my other profiles.. So one stupid install of a browser plugin for Chrome… And the next thing I know I have a dozen infected machines.

I only ever have this problem with Chrome and Firefox.

Long live Microsoft and long live Internet Explorer.

PS – bring back VB6!!

Strategies for Managing Spreadmarts

Business users are empowered by knowledge—and knowledge comes, in part, from having access to accurate and timely information. It is generally up to the information technology (IT) department to supply this information. But it doesn’t always work out that way.

Definition of a Spreadmart. TDWI used the following definition of a spreadmart in the survey it conducted as part of this report:

  • A spreadmart is a reporting or analysis system running on a desktop database (e.g., spreadsheet, Access database, or dashboard) that is created and maintained by an individual or group that performs all the tasks normally done by a data mart or data warehouse, such as extracting, transforming, and formatting data as well as defining metrics, submitting queries, and formatting and publishing reports to others. Also known as data shadow systems, human data warehouses, or IT shadow systems.

In organizations all over the world, business people bypass their IT groups to get data from spreadmarts. Spreadmarts are data shadow systems in which individuals collect and massage data on an ongoing basis to support their information requirements or those of their immediate workgroup. These shadow systems, which are usually built on spreadsheets, exist outside of approved, IT-managed corporate data repositories, such as data warehouses, data marts, or ERP systems, and contain data and logic that often conflict with corporate data. Once created, these systems spread throughout an organization like pernicious vines, strangling any chance for information consistency and reliability. You’ll find them in all industries, supporting all business functions. According to TDWI Research, more than 90% of all organizations have spreadmarts. (See Figure 1.)

Does your group have any spreadmarts?

Spreadmarts often lead to the phenomenon of dueling spreadsheets. Murray Trim, a management accountant with Foodstuffs South Island Limited, described one such situation: “We have had the classic situation of two people presenting ostensibly the same data at a board meeting with different figures, which they got from different spreadmarts.” Donna Welch, a BI consultant at financial holding company BB&T, talks about the issues of trust that arise from dueling spreadsheets: “We constantly hear our users talk about management’s distrust of their reports because multiple people came up with different answers.”

Who and Why. Spreadmarts are usually created by business analysts and power users who have been tasked to create custom reports, analyses, plans, benchmarks, budgets, or forecasts. Often, these analysts—especially those in the finance department and the financial services industry—have become proficient with Microsoft Excel or Microsoft Access and prefer to use those tools to create reports and analyses. As a result, most are reluctant to adopt a new corporate reporting “standard,” which they believe will limit their effectiveness. Change comes hard, especially when it means learning a new toolset and adapting to new definitions for key entities, calculations, or metrics. Executives perpetuate the problem because they don’t want to pay hundreds of thousands of dollars or more to build a robust data infrastructure and deploy enterprise reporting and analysis tools. Instead, spreadmarts proliferate.

Dangers of Spreadmarts

Inconsistent Views. The problem with spreadmarts is that their creators use different data sources, calculations, calendars, data conversions, naming conventions, and filters to generate reports and analyses based on their view of the business. The marketing department views customers and sales one way, while the finance department views them another way. The way the business operates in Germany is different from the way it operates in Brazil. Business units sell the same products with different names, packaging, pricing, and partner channels. When each group manages its own data and processes, it’s nearly impossible to deliver a consistent, enterprise view of customers, products, sales, profits, and so on. These parochial silos of data undermine cross-departmental and business unit synergies and economies of scale.

Excessive Time. In addition, business analysts spend two days a week—or almost half their time—creating spreadmarts, costing organizations $780,000 a year! Instead of analyzing data, these high-priced employees act like surrogate information systems professionals, gathering, massaging, and integrating data. Many executives have initiated BI projects simply to offload these time-consuming data management tasks from analysts.

Increased Risk. In addition, spreadmarts are precarious information systems. Because they are created by business users, not information management professionals, they often lack systems rigor. The problems are numerous:

  • Users often enter data into spreadmarts by hand, which leads to errors that often go undetected.
  • Few spreadmarts scale beyond a small workgroup.
  • Users may create poorly constructed queries, resulting in incorrect data.
  • Spreadmarts may generate system and data errors when they are linked to upstream systems or files that change without notice.
  • Users embed logic in complex macros and hidden worksheets that few people understand but nevertheless copy when creating new applications, potentially leading to unreliable data.
  • There is no audit trail that tracks who changed what data or when to ensure adequate control and compliance.

In short, spreadmarts expose organizations to significant risk. Business people may make decisions based on faulty data, establish plans using assumptions based on incorrect analyses, and increase the possibility of fraud and theft of key corporate data assets.
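To make one of these risks concrete, the missing audit trail can be addressed with very little code. This Python sketch (names invented) shows a minimal append-only change log of the kind spreadmarts lack:

```python
import datetime

# One of the risks listed above is the absence of an audit trail. Even a
# minimal append-only change log records who changed what and when,
# which a hand-edited spreadsheet does not. Illustrative sketch only.
class AuditedTable:
    def __init__(self):
        self.rows = {}
        self.log = []  # append-only: (timestamp, user, key, old_value, new_value)

    def set(self, user, key, value):
        old = self.rows.get(key)
        self.rows[key] = value
        self.log.append(
            (datetime.datetime.now(datetime.timezone.utc).isoformat(),
             user, key, old, value)
        )

t = AuditedTable()
t.set("alice", "Q1_revenue", 1_200_000)
t.set("bob",   "Q1_revenue", 1_150_000)  # bob's "correction" is now traceable
print(t.log)
```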

Not All Bad?

No Alternative. Despite these problems, there is often no acceptable alternative to spreadmarts. For example, the data that people need to do their jobs might not exist in a data warehouse or data mart, so individuals need to source, enter, and combine the data themselves to get the information. The organization’s BI tools may not support the types of complex analysis, forecasting, or modeling that business analysts need to perform, or they may not display data in the format that executives desire. Some organizations may not have an IT staff or a data management infrastructure, which leaves users to fend entirely for themselves with whatever tools are available.

As such, spreadmarts often fill a business requirement for information that IT cannot support in a timely, cost-effective manner. Spreadmarts give business people a short-term fix for information that they need to close a deal, develop a new plan, monitor a key process, manage a budget, fulfill a customer requirement, and so on. Ultimately, spreadmarts are a palpable instantiation of a business requirement. IT needs to embrace what the business is communicating in practice, if not in words, and take the appropriate action. Thus, spreadmarts should not be an entirely pejorative term.

Cheap, Quick, Easy. Moreover, since spreadmarts are based on readily available desktop tools, they are cheap and quick to build. Within a day or two, a savvy business analyst can prototype, if not complete, an application that is 100% tailored to the task at hand. Although the spreadmart may not be pretty or “permitted,” it does the job. And it may be better than the alternative—waiting weeks or months for IT to develop an application that often doesn’t quite meet the need and that costs more than executives or managers want to pay.

Nevertheless, there is a high price to pay for these benefits in the long term. Many executives have recognized the dangers of spreadmarts and made significant investments to fix this problem. However, not all have succeeded. In fact, most struggle to deliver a robust data delivery environment that weans users and groups off spreadmarts and delivers a single version of truth.


Managed BI Environment. The problem with spreadmarts is not the technology used to create them. Spreadsheets and other desktop-oriented tools are an important part of any organization’s technology portfolio. The problem arises when individuals use these tools as data management systems to collect, transform, and house corporate data for decision making, planning and process integration, and monitoring. When this happens, spreadmarts proliferate, undermining data consistency and heightening risk.

The technical remedy for spreadmarts is to manage and store data and logic centrally in a uniform, consistent fashion and then let individuals access this data using their tools of choice. In other words, the presentation layer should be separated from the logic and data. When this is done, business users can still access and manipulate data for reporting and analysis purposes, but they do not create new data or logic for enterprise consumption. At TDWI, we call this a managed business intelligence environment. The goal is to transform spreadmarts into managed spreadsheets. This lets IT do what it does best—collect, integrate, and validate data and rules—and lets business analysts do what they do best—analyze data, identify trends, create plans, and recommend decisions.
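The separation described above can be sketched in miniature. In this illustrative Python example (all names invented), the metric logic is defined once centrally, and each presentation layer calls it rather than re-implementing the calculation:

```python
# Managed BI in miniature: metric logic lives in one central place, and
# each "presentation" (a report, a dashboard) calls it rather than
# redefining the calculation. Names and figures are invented.

# --- central, IT-managed logic (defined once) ---
def gross_margin(revenue, cogs):
    return (revenue - cogs) / revenue

# --- two presentation layers, neither of which redefines the metric ---
def finance_report(data):
    return f"Gross margin: {gross_margin(data['revenue'], data['cogs']):.1%}"

def marketing_dashboard(data):
    return round(gross_margin(data["revenue"], data["cogs"]), 3)

data = {"revenue": 500_000, "cogs": 320_000}
print(finance_report(data))        # both views agree,
print(marketing_dashboard(data))   # because the logic is shared
```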

BI vendors are starting to offer more robust integration between their platforms and Microsoft Office tools. Today, the best integration occurs between Excel and OLAP databases, where users get all the benefits of Excel without compromising data integrity or consistency, since data and logic are stored centrally. But more needs to be done.

Change Management. Applying the right mix of technology to address the spreadmart problem is the easy part. The hard part is changing habits, perceptions, behaviors, processes, and systems. People don’t change on their own, especially when they’ve been successful with a certain set of tools and processes for analyzing data and making decisions. Changing a spreadmart-dependent culture usually requires top executives to both communicate the importance of having unified, consistent, enterprise data, and to apply incentives and penalties to drive the right behaviors. Ultimately, change takes time, sometimes a generation or two, but the right organizational levers can speed up the process.

Aligning Business and IT. Another dynamic driving spreadmarts is the lack of communication and trust between business and IT. The business doesn’t adhere to the architectural standards and processes designed to support its long-term interests, while IT doesn’t move fast enough to meet business needs. To reverse this dynamic, both business and IT must recognize each other’s strengths and weaknesses and learn to work together for the common good. IT must learn to develop agile information systems that adapt quickly to changing business conditions and requirements. The business must recognize the importance of building sustainable, scalable solutions. IT must learn about the business and speak its language, while the business must not blame IT for failures when it continually underfunds, overrides, and hamstrings IT so that it cannot possibly serve business needs.

Recognizing that you have a spreadmart problem is the first step. Most of the people we surveyed know their organizations have spreadmarts, but they don’t know what to do about them.

The survey presented respondents with nine different approaches to addressing the spreadmart issue. (See Table 1.)

What strategies have you employed to remedy the problems caused by spreadmarts, and how effective were they?

Table 1. Respondents could select more than one response.

Ironically, the most common approach that organizations use is simply to leave the spreadmarts alone. But as with everything else in life, ignoring a problem does not make it go away, and often makes it worse. When asked how effective this approach was, a majority (58%) said “not very effective.”

Replace with BI Tools. The next most popular approach is to “provide a more robust BI/DW solution,” employed by almost two-thirds of respondents (63%). This approach was considered “very effective” by 24% of respondents. BI software has progressed from best-in-class niche products to BI platforms that provide integrated reporting, analysis, visualization, and dashboarding capabilities within a single, integrated architecture. In addition, many BI vendors now offer planning, budgeting, and consolidation applications to supplement their BI offerings.

We recommend caution with these BI replacement approaches. First, don’t assume that business users will find the BI tools easy to use. Second, don’t assume that business users will see the benefit of these systems if their spreadmarts are answering their business questions today. Get business users (not just power users) involved in the selection and implementation of BI tools, provide ongoing training, and market the benefits. “If it ain’t broke, don’t fix it”—if the business users are not committed to using the BI tools, walk away from the project and look for other spreadmarts the business perceives as a problem.

Create a Standard Set of Reports. Almost as many companies (58%) assumed that creating a standard set of reports using their standard BI tools would eliminate the need for spreadmarts as those that implemented new BI tools (63%). Organizations assumed that these reports would become their systems of record for decision making. Only 18% found this approach very effective. The most likely reasons for the shortcoming were, first, that no set of reports will effectively cover every management decision, so there was a gap in what was provided. Second, since this approach burdened IT with a queue of reports to develop, the business faced two of the primary reasons spreadmarts were created initially: the IT group did not understand what the business needed, and the IT group was not responsive to business needs.

Excel Integration. The only approach respondents rated more effective than adopting BI tools was “providing BI tools that integrate with Excel/Office” (29%). For a spreadmart user, the next best thing to Excel is Excel that integrates with the corporate BI standard. This approach was used by slightly more than half of the respondents (53%). However, Office integration technology can also give users more fuel to proliferate spreadmarts if it enables them to save data locally and disseminate the results to others. Some BI vendors—and ironically, Microsoft is one of them—now provide a thin-client Excel solution where administrators can deny users the ability to download or manipulate data.

Some experts claim that power users use BI tools mainly as a personalized extract tool to dump data into Excel, where they perform their real work. According to our survey, that’s not the case. Only a small percentage (7%) of spreadmarts obtain data this way. More than half of spreadmarts (51%) use manual data entry or manual data import. It follows that a major way to drain the life out of spreadmarts is to begin collecting the data they use in a data warehouse and create standard reports that run against that data. Of course, if there are no operational systems capturing this data, then a spreadmart is the only alternative.

Sometimes strong-arm tactics are effective in addressing spreadmarts. Reassigning the creators of spreadmarts to other activities is certainly effective, if an executive has the clout to carry this out and offers a suitable BI/DW replacement system. For example, the director of operations at a major national bank reassigned 58 people who had been creating ad hoc performance reports, replacing their output with a set of standard reports built on a standard BI platform, saving $300 million a year and dramatically improving the bank’s quality and efficiency in industry benchmarks. This may be the dream of those who are hostile to spreadmarts, but the survey illustrates that it is a rare occurrence.

Gentler approaches are seldom very effective. New policies for the proper use of spreadsheets generally fall on deaf ears; they are very effective only 12% of the time. The problem isn’t that business people do not know how to use the spreadsheets, but that they think they have no alternative.

Multiple Solutions. Given the low percentage of respondents who can vouch for the effectiveness of any of the approaches listed in Table 1, it’s not surprising that managing the proliferation of spreadmarts is such a difficult task. It is more of a change management issue than a technological one. While it’s important to bring new technologies to bear, such as BI tools that integrate with Excel, it’s critical to figure out which levers to push and pull to change people’s habits and perceptions. No single approach is effective on its own; therefore, organizations must apply multiple approaches.


Spreadsheets are here to stay. Business users have them, are familiar with them, and will use them to do their jobs for years to come. Memo to IT: Deal with it! Our recommendation is to choose a solution that balances business and IT priorities and yields the greatest business value.

SSIS: Work Flow vs Stored Procedures

When importing a file into a SQL table, we create a Work Flow. But for transferring data from one SQL Server table to another, is it better to use an Execute SQL Task (stored procedures) or a Work Flow?

This is a classic debate in SSIS. A lot of times in data warehousing we need to transfer data from the staging tables to the fact and dimension tables. Should we use SQL Task or Work Flow?

There are 4 main considerations here:

  1. Data quality checking
  2. ETL framework
  3. Performance
  4. Development time

Data Quality Checking

There are a number of data quality checks that we need to perform on the incoming data and log accordingly, potentially rejecting the incoming data: for example, data type validations, number of columns, and whether the data is within a certain allowable range or conforms to a certain list. These DQ checks should be built only once and used many times, avoiding redundant work. For that purpose, it is easier to build the DQ checks in the form of stored procedures, running dynamic SQL on many staging tables one by one. One of the main principles in DQ is that any silly data in the incoming feed should not fail the data load. It should be gracefully recorded while the whole ETL package carries on. It is an order of magnitude more difficult to build the DQ routines as script tasks, which are executed before the data flows into the warehouse. On the other hand, data profiles are easier to build using the Data Profiling task. What I’m saying is that the decision whether to use a data flow or a stored procedure/Execute SQL Task is affected by how the DQ routines were built.
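In the scenario above the DQ checks are T-SQL stored procedures running dynamic SQL; the Python sketch below (rules and table names invented) just illustrates the shape: one rule set applied to many staging tables, where bad rows are logged and rejected but the load itself never fails.

```python
# Illustrative sketch of reusable DQ checks. In the article these are
# T-SQL stored procedures; the shape is the same: build once, apply to
# many staging tables, reject gracefully, never fail the load.
RULES = {
    "amount": lambda v: isinstance(v, (int, float)) and 0 <= v <= 1_000_000,
    "country": lambda v: v in {"US", "UK", "FR", "DE"},
}

def dq_check(table_name, rows, rules=RULES):
    good, rejects = [], []
    for i, row in enumerate(rows):
        bad = [col for col, ok in rules.items() if col in row and not ok(row[col])]
        if bad:
            rejects.append((table_name, i, bad))  # logged; the load carries on
        else:
            good.append(row)
    return good, rejects

good, rejects = dq_check("stg_sales", [
    {"amount": 250.0, "country": "US"},
    {"amount": -5,    "country": "US"},  # out of range -> rejected, not fatal
    {"amount": 10,    "country": "XX"},  # not in list  -> rejected, not fatal
])
print(len(good), rejects)
```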

ETL Framework

In every data warehousing or data integration project that uses SSIS as the ETL tool, the first step is to build an ETL framework. This framework handles error checking, alert notification, task failures, logging, execution history, file archiving and batch control. It is built as a “parent child” package system, supported by a series of ETL metadata tables, as per chapter 10 of my book, e.g. a data flow table, a package table and a status table. What I’m saying here is that the decision of whether to use a data flow or stored procedures/Execute SQL Tasks is affected by your ETL framework. I know that it should be the other way around: the ETL framework should be built to incorporate both the workflow and the stored procedures. If that’s the case in your project, that is excellent; there’s no problem here. But practically speaking, I’ve seen several cases where we could not implement a data transfer routine as a workflow because the ETL framework dictated that it be implemented as a stored procedure.

The next 2 points are the heart of the matter. They are the real reasons for choosing between the work flow approach and stored procedures when it is a green field, meaning that you have complete freedom to choose, without any existing corporate rules/architecture affecting your decision.

Performance
Performance is about how fast the data load is. Given the same amount of data to load from the staging table into the main table, which is the fastest method: a select insert, or a data flow? Generally speaking, if the data is less than 10k rows, there’s no real difference in performance. It is how complicated your DQ stuff is that slows it down, not whether it’s a workflow or a stored procedure. If you are lucky enough to be involved in a project that loads billions of rows every day, you should be using a work flow. Generally it is faster than a stored procedure. The main issue with using a stored procedure to do 1 billion upserts in a SQL Server database is the bottleneck on tempDB and the log files. Your DBA wouldn’t be happy if you blew up tempDB from a nice 2 GB to 200 GB. Ditto with log files.
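The tempDB/log blowup comes largely from running the whole upsert as one enormous transaction; committing in batches bounds it. A sketch, with SQLite standing in for SQL Server and an arbitrary batch size:

```python
import sqlite3

# Batched upsert sketch. SQLite stands in for SQL Server so the example
# runs anywhere; table name and batch size are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")

def batched_upsert(conn, rows, batch_size=10_000):
    batches = 0
    for start in range(0, len(rows), batch_size):
        conn.executemany(
            "INSERT INTO target (id, val) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET val = excluded.val",
            rows[start:start + batch_size],
        )
        conn.commit()  # the log is released per batch, not held for the whole load
        batches += 1
    return batches

rows = [(i, f"v{i}") for i in range(25_000)]
print(batched_upsert(conn, rows))
```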

Using a workflow, you can split a derived column transformation into several transformations, boosting the throughput to up to twice as fast. See here for details from the SQLCat team. This principle is applicable to any synchronous task, including the data conversion transform, lookup, row count, copy column and multicast. See here for an explanation of sync vs async tasks. The thing that gives us the most performance gain is to use multiple workflows to read different partitions of the source table simultaneously. This will certainly create a bottleneck on the target, so it too needs to be partitioned, pretty much the same way as the source table. The other thing that increases performance is the use of a cache on the lookup transformation. Using Full Cache, the entire lookup table is pulled into memory before the data flow is executed, so that the lookup operation is lightning fast. Using Partial Cache, the cache is built as the rows pass through. When a new row comes in, SSIS searches the cache (memory) for a match; only if it doesn’t find one does it fetch the data from disk. See here for details. You don’t get any of this when you use stored procedures to transfer the data.
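Full Cache versus Partial Cache can be shown in miniature. In this illustrative Python sketch, `fetch_from_disk` stands in for querying the lookup table:

```python
# Full Cache vs Partial Cache in miniature. LOOKUP_TABLE and
# fetch_from_disk stand in for the real lookup table on disk.
LOOKUP_TABLE = {i: f"name_{i}" for i in range(1000)}  # pretend this lives on disk

def fetch_from_disk(key):
    return LOOKUP_TABLE.get(key)

class FullCacheLookup:
    """Pull the entire lookup table into memory before the data flow runs."""
    def __init__(self):
        self.cache = dict(LOOKUP_TABLE)

    def lookup(self, key):
        return self.cache.get(key)

class PartialCacheLookup:
    """Build the cache as rows pass through; hit disk only on a miss."""
    def __init__(self):
        self.cache = {}
        self.disk_reads = 0

    def lookup(self, key):
        if key not in self.cache:
            self.disk_reads += 1
            self.cache[key] = fetch_from_disk(key)
        return self.cache[key]

p = PartialCacheLookup()
for key in [1, 2, 1, 1, 2, 3]:
    p.lookup(key)
print(p.disk_reads)  # 3 distinct keys -> only 3 disk reads for 6 rows
```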

Development Time

You may say that development time matters less than performance in how strongly it influences the decision between a work flow and stored procedures. But in reality this factor is significant. I have seen several cases where the ETL developer is more comfortable coding in Transact SQL than using SSIS transformations. They are probably twice as fast building it in stored procedures as in SSIS transformations, due to their past experience. Understandably, this is because the majority of so-called “SSIS developers” were “SQL developers” first. They may have been doing SSIS for 2 years, but they have been doing SQL stored procedures for 10 years. For example, many developers are more conversant with date conversion in Transact SQL than in a Derived Column.

Conclusion
If you are lucky enough to be able to choose freely, a work flow gives more performance and flexibility. But as with everything else in the real world, there are other factors that tie your hands, e.g. the data quality checking, the ETL framework and the development time.

As always I’d be glad to receive your comments and discussion at Vincent 27/2/11.