Tuesday, April 29, 2008

What's new in week 18, 2008

Tuesday, 2008-04-29, Copenhagen

Teradata has recently brought new members of its data warehouse platform family to market. The new members, including the Teradata 550 SMP, 2500 SMP, and 5550 SMP, are all focused on SMP architecture. Apparently, the new members enhance Teradata’s parallel processing power.

I always like to spread information about free software, especially free security tools. Here is a list (http://www.eweek.com/c/a/Security/Ten-Free-MustHave-Security-Tools/) of ten must-have tools: Secunia Personal Software Inspector, OpenDNS, Haute Secure, Trend Micro RUBotted, AVG Anti-Rootkit, ZoneAlarm Firewall, BitDefender 10 Free AntiVirus, CCleaner, WinPatrol, and NoScript.

Wednesday, April 23, 2008

Reading notes for "Universal Metadata Models" Part I

This is the second book by David Marco on metadata management in large enterprises. Michael Jennings is the second author of this book.



Part I, Presenting the Managed Metadata Environment

First, a managed metadata environment (MME) is far more than a repository of metadata or a data warehouse. Building a data warehouse for metadata is much more difficult than building one for normal data, because the data sources are so varied and the links among the data are hard to maintain.

Some companies try to do point-to-point integration with EAI tools or just with XML. The problem with such approaches is that, when the company grows too large, the effort of maintaining the integration can very easily consume the whole IT budget. So it’s better to start thinking about building an MME as early as you can.

Inevitably, when you think about building an MME from an architectural point of view, many architecture elements are very similar to those of a data warehouse architecture. Specifically, they are the sourcing layer, integration layer, repository, data management layer, data marts, and data delivery layer.

Similar to the development of a data warehouse, the governance and stewardship models of metadata are also important.


What can we achieve when we have a good MME? If we look at the SE-CMM for data warehousing (CMM is something big enterprises always focus on, and it can indeed help them improve the business by reducing cost and promoting new business opportunities), most Global 2000 companies are at level 1 or 2. With a good MME, it is possible to move up to level 3 and level 4 (which is called “world class”). And of course, level 5 (continuously improving) has not been achieved by anyone yet.



Monday, April 21, 2008

What's new in week 17, 2008

Monday, 2008-04-21, Copenhagen

Are solution providers being phased out due to the economy? According to recent news, the number of big IBM solution providers in the UK has shrunk from 10-12 to 2 or 3, so it looks like the market for VARs is shrinking. You can also view that as consolidation in the techonomy (technology-economy) market. In an economic downturn, consolidation is a good way for small IT fish to survive under the wings of bigger ones. As we will surely witness, those small fish will come out again in even greater numbers when economic growth returns.

Wednesday, 2008-04-23, Copenhagen

It seems that Microsoft is serious about the next generation of the web (web 2.0). One of the most brilliant ideas that MS has for web 2.0 is mesh-up. At the Web 2.0 conference, MS is launching its Live Mesh synchronization solution for developers who are targeting the next generation of IT applications.

Free music! Yes, for Nokia customers. Nokia has made a deal with Sony BMG so that all buyers of certain Nokia music phones will be able to download music for free for the next 12 months. And these users are welcome to keep the music. It seems that media products, such as songs, music, and video shows, are trending toward being free for the public, and of course there are commercial motives behind the trend. Can we see a better future for P2P media-sharing tools? I would say so.

Wednesday, April 16, 2008

What's new in week 16, 2008

Tuesday, 2008-04-15, Copenhagen

I just read an article by Richard Winter on the large growth of data volume in enterprise data warehouses. Here are my learning points from the article. Why is it that many enterprises envision bigger growth of data volume in the near future? Mostly this is due to business needs and the better ways (that technology has brought about) to measure, collect, and calculate data at its most detailed level. For example, most enterprises used to retain 3 years of historical data on disk for analysis. Now more of them are trying to keep 7 years of historical data in order to be more accurate in the business competition. So the need for better hardware will never stop, because human beings are becoming more ambitious.

Google recently announced the “Google Solutions Marketplace” which is a network that allows customers and partners to find each other and do business. Most products in this market are based on Google’s communication, search, and collaboration products.

Friday, 2008-04-18, Copenhagen

We are getting used to what Google can provide us. But is there another search engine that can be as competitive as Google? Yahoo? Live? Or what? If we look at the recent news that AOL has just acquired Sphere Source, a developer of contextual-search tools that make connections between content from blogs, video, media, photos, and advertisements, it seems that other people are thinking about the same online search market. I predict that, in the not too distant future, AOL will push its own new brand of search engine to market with a lot of competitive features.

Is webBI a form of SAAS for BI applications? Well, it depends. WebBI is indeed a way of enabling SAAS, and it provides good opportunities for offshore BI applications and services. But webBI is far from being adopted by medium and large enterprises. What stands in the way is the matter of security and regulations.

Friday, April 11, 2008

Reading notes for "Building and Managing Meta Data Repository," Part II

Here are the reading notes for the rest of "Building and Managing Meta Data Repository."

There are two ways to discover the metadata in an organization: top-down and bottom-up. The top-down approach is more likely to be used if a project team has the opportunity to organize and summarize all kinds of metadata in an organization, regardless of any existing tools and systems. The bottom-up approach is used if a project team is mainly going to fulfill metadata needs using certain existing software and repositories.

Normally a metadata tool should have the same kinds of administrative facilities as a database system, for example security, concurrent access, change management, integrity and consistency validation, and error recovery.

One should also think of how the metadata tool can accommodate existing standards.

There are two categories of metadata tools, the data integration and repository tool and the data access tool. It’s just like the back-end and front-end of any systems.

The “-ilities” for the architecture of a metadata system are quite similar to those for a data warehouse. The additional “-ilities” are “customizable” and “open.”

Customizable means the metadata tool must be customized to meet specific business needs. This is quite important for those who use prepackaged metadata solutions. The tool must be able to provide abilities for customization.

Open means the metadata tool must allow sharing of metadata.

It is very difficult to define database naming standards and use them throughout the whole enterprise, even for Global 2000 companies.

There are many sources of metadata, for example, ETL tool/process, data modeling tools, documents, employees, reporting and OLAP tools, vendor applications, and data quality tools.

One should bear in mind that a metadata repository is just like a data warehouse. So there can be multiple versions of metadata, much like slowly changing dimensions.
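The versioning idea can be sketched in Python along the lines of a Type 2 slowly changing dimension: each change closes the current version and appends a new one, so older definitions stay queryable. This is only an illustration, not something taken from the book; the class names, field names, and the `net_sales` example are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class MetadataVersion:
    """One version of a metadata entry (hypothetical structure)."""
    name: str
    definition: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


class VersionedRepository:
    def __init__(self) -> None:
        self.rows: List[MetadataVersion] = []

    def update(self, name: str, definition: str, as_of: date) -> None:
        """Close the current version (if any) and append a new one."""
        for row in self.rows:
            if row.name == name and row.valid_to is None:
                row.valid_to = as_of
        self.rows.append(MetadataVersion(name, definition, as_of))

    def lookup(self, name: str, on: date) -> Optional[str]:
        """Return the definition that was valid on the given date."""
        for row in self.rows:
            if row.name != name:
                continue
            if row.valid_from <= on and (row.valid_to is None or on < row.valid_to):
                return row.definition
        return None


# Hypothetical example: the business definition of a measure changes over time.
repo = VersionedRepository()
repo.update("net_sales", "gross sales minus returns", date(2007, 1, 1))
repo.update("net_sales", "gross sales minus returns and discounts", date(2008, 1, 1))
```

Calling `repo.lookup("net_sales", date(2007, 6, 1))` returns the older definition, so a report produced in 2007 can still be explained with the rules that applied at that time.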

If a metadata repository is complete, it should include logic, rules, tests, and even requirements, and this metadata can be of great help to data quality applications. It can be organized into a “data quality dictionary.”
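One way to picture such a dictionary is as a set of named rules, each with a description and a test, that a data quality application can run against incoming rows. The Python sketch below is an assumption about how this could look, not the book's design; the rule names, fields, and sample rows are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical "data quality dictionary": rule name -> (description, test).
quality_rules: Dict[str, Tuple[str, Callable[[Row], bool]]] = {
    "customer_id_present": (
        "customer_id must not be empty",
        lambda row: row.get("customer_id") is not None,
    ),
    "amount_non_negative": (
        "amount must be a number that is zero or more",
        lambda row: isinstance(row.get("amount"), (int, float))
        and row["amount"] >= 0,
    ),
}


def violations(rows: List[Row]) -> List[Tuple[int, str]]:
    """Return (row index, rule name) for every rule a row fails."""
    return [
        (index, name)
        for index, row in enumerate(rows)
        for name, (_description, test) in quality_rules.items()
        if not test(row)
    ]


# Hypothetical sample data: the second row breaks both rules.
rows = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": None, "amount": -5},
]
```

Because the rules live in a dictionary rather than in application code, they can be listed, described, and versioned like any other metadata.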

Sometimes it also makes sense to include the most basic technical metadata, for example the names of the production servers, how many CPUs each has, etc.

What is a meta model? It is the physical data model for the metadata. There can be two types of meta model. The first is based on a generic object model, like what you can see in the system tables in SQL Server. Having a very generic model means that extending it with new elements takes little effort (why? Because it is generic!). The second is just like a normal entity-relationship model: the metadata team lists all the kinds of metadata and then treats them as different entities or relationships. What normally happens is that most teams start with the ER way of modeling and find out later that the object model makes more sense in the end.
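The difference is easier to see in a small sketch. In the generic approach, the kind of metadata ("table", "column", "server") is stored as data, so a new kind needs no change to the model itself. The Python below is an illustration under that assumption; the class names, relation names, and example objects are hypothetical and do not come from the book or from SQL Server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MetaObject:
    """A generic metadata object: its type is a value, not a separate class or table."""
    object_id: int
    object_type: str  # e.g. "table", "column", "report"
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


class GenericMetaModel:
    def __init__(self) -> None:
        self.objects: Dict[int, MetaObject] = {}
        self.relations: List[Tuple[int, str, int]] = []  # (from_id, relation, to_id)
        self._next_id = 1

    def add(self, object_type: str, name: str, **attributes: str) -> int:
        """Store a new object of any type and return its id."""
        obj = MetaObject(self._next_id, object_type, name, dict(attributes))
        self.objects[obj.object_id] = obj
        self._next_id += 1
        return obj.object_id

    def relate(self, from_id: int, relation: str, to_id: int) -> None:
        self.relations.append((from_id, relation, to_id))

    def children(self, from_id: int, relation: str) -> List[str]:
        """Names of the objects reached from from_id through the given relation."""
        return [
            self.objects[to_id].name
            for source, rel, to_id in self.relations
            if source == from_id and rel == relation
        ]


model = GenericMetaModel()
table = model.add("table", "customer", owner="sales")
column = model.add("column", "customer_id", datatype="int")
model.relate(table, "contains", column)

# A new kind of metadata needs no change to the model: just a new type value.
server = model.add("server", "prod-db-01", cpus="8")
model.relate(server, "hosts", table)
```

In an ER-style model, adding "server" would mean a new entity and new relationships in the schema. The price of the generic model is that types and attributes are only checked by convention, which is one reason teams often start with ER.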

One should also think about what kinds of metadata delivery to include when developing a metadata system. Metadata does not only arrive with batch data from the source system; there are many sources. The architecture of the metadata delivery is also important.

One interesting and very useful direction is to think about whether we can make the metadata repository bi-directional. That means updates to the metadata can be entered at any available entry point and will be reflected to other relevant parties in a very short time. In addition, if every business user or IT user in the company starts their work by thinking about and using metadata, the company’s data management will be in excellent shape.

To give a final hint, metadata is now considered a highly valuable asset for an enterprise. But not everybody knows how to make it real. In fact, just think of it as yet another data warehouse for your enterprise. The suggestion I would give, if I had only one sentence left, is to let the people who have the broadest overview and a solid educational background in data management theory design it.

Monday, April 7, 2008

Notes for Ch. 3 of "Beyond Software Architecture"

Both marketecture and tarchitecture represent part of the whole picture of a system, and they must work together to achieve business objectives. Sometimes what is built may look quite naïve from a technical point of view but can be a very successful architecture from a business point of view. Take the "Boolean flags" example in the book: it is a potentially problematic approach, but it is easy for business people to understand.

Normally, at the start of a development cycle, everything begins with the problem domain, the technology, and the –bilities that the business side wants. The problem domain is where you find the actual requirements, the technology is where you look for proper technical foundations to implement the solution, and the –bilities are where people discuss non-functional requirements and set priorities. Here the priority is very important. People must set it and agree upon it at the beginning. It is OK to change it later (but not too frequently), but every change will force the team to look back at what they have done and make certain adjustments.

In the software vendors' world, a successful marketecture normally means that you have to look into what the customers will need in the next 18 or 24 months rather than what they need now. And you always have to keep this marketecture current with that future date. Another important thing is to make sure that the tarchitecture agrees with the marketecture. Otherwise, there will be a major discontinuity in the product life cycles.

In a product development lifecycle, you always have to talk to people and get feedback. Sometimes the marketecture is used to talk to business people and the tarchitecture is used to talk with technology-related people or information sources. One important thing is that, if there are changes to or disagreements about the pictures, there must be a way to maintain the two pictures and get people to agree on them.

One important use of the marketecture and tarchitecture is that they are the best models for talking to the project team and all the stakeholders. The project team should always reconcile these two "architectures" if either of them changes. It is important to make the latest versions of these two models available to the relevant parties of the project.

Normally it makes good sense to start making the marketecture and tarchitecture using context diagrams. What is a context diagram? According to Wiki, "System Context Diagram (SCD) is the highest level view of a system, similar to Block Diagram, showing a (normally software-based) system as a whole and its inputs and outputs from/to external factors. SCDs are a type of Data Flow Diagram, and they should always be produced as DFDs. Context Diagrams show the interactions between a system and other actors with which the system is designed to face." An SCD is very helpful in understanding the context of a system.

What's new in week 15, 2008

Monday, 2008-04-07, Copenhagen

HP does have big ambitions in the Enterprise Information Management market. It has just announced the acquisition of the Australia-based company Tower, an Enterprise Content Management vendor. HP has already built a “league” of software vendors around enterprise information management. It has been competing with IBM in this market for a while. I believe it will not be long before we see HP acquire Informatica and other BI software vendors.

The Dojo foundation has just released its ver 1.1 of the Dojo toolkit. Dojo is an open source DHTML toolkit designed to enable developers to build dynamic capabilities into web pages and other environments.

Wanna browse YouTube videos directly from your inbox? There are people working on social inbox utilities, such as Xobni, Xoopit, and Yahoo. It seems that Google and Microsoft will be joining them very soon.

As I remember, I wrote sometime ago that there have been so many social networking sites and one way to stay with all your friends from different networks is to have another website that does some kind of “integration” work. Check out “plaxo.com.” This is the site I meant.

Tuesday, 2008-04-08, Copenhagen

Here is a very interesting list of websites I would like to share: the 25 great geek sites.
Bluesnews.com, theinquirer.net, betanews.com, arstechnica.com, osnews.com, Beyond3D.com, hardocp.com, techreport.com, anandtech.com, mvktech.net, Silentpcreview.com, guru3d.com, hacknmod.com, tech-forums.net/pc, driverheaven.net/forum.php, hardforum.com, avsforum.com, rockpapershotgun.com, joelonsoftware.com, gamepolitics.com, engadget.com, codinghorror.com, thinkgeek.com, xkcd.com, mikeshardware.co.uk.

Yahoo is preparing an ad management platform for customers to buy and sell ads online. Online ads have been a big trend in bringing new business models to the market. After Google’s success (well, I mean its success in getting so much attention and investment), most websites are seeking new models in the online market. Despite the recent acquisition activity with Microsoft, Yahoo is looking forward to this new piece of cake.

Thursday, 2008-04-10, Copenhagen

Microsoft is also quite open to the open source world. Quite recently, Microsoft started to make available the technical docs for the protocols built into Office 2007 and Exchange Server 2007. The docs will let developers understand the connection protocols used among the tools in the MS Office suite, e.g., SharePoint 2007, Outlook, etc., and write programs that use them.

Google is now starting to host web applications, like what Amazon has been doing. It seems that more and more vendors are focusing on cloud computing. When will Microsoft start to do the same thing? I do not believe it will, unless MS can acquire Yahoo or MSN has a major strategy change.

Licensed online video? Yes, ModernFeed.com has started to aggregate a lot of licensed content on the web. From a programmer’s point of view, ModernFeed is just collecting the “pointers” to all kinds of video sources, rather than hijacking them. It will be interesting to know how they can make an online video show play as smoothly as possible without using P2P technologies.

It seems that OOXML, the MS standard, has been approved by ISO. What does it mean for most of the enterprise Office users? Primarily it is about transforming old files into the new format. I would agree that XML is the best tool for standardizing information, but does it take more space on my disk?

Saturday, 2008-04-12, Copenhagen

Many of us watch online videos and we are aware of the license problem. Is it possible to save licensed online videos and watch them later in offline mode? It's a matter of protection technique. Adobe just announced version 1.0 of Adobe Media Player. How can yet another media player make sense to any user? The answer is that one of the major features of this media player is that it lets users download licensed online videos and watch them in offline mode, and it is still legal to do so. I think one major effort put into this player is the way it prevents users from abusing the downloaded video content. No doubt a lot of hackers will try to crack the tool. So let us just cross our fingers and wait and see.

Barcelona is available now. I mean, the AMD quad-core Opteron processor. Now it's Intel's turn to make the quad-core or multi-core chipsets.

Here's an interesting list of websites, the 10 sites for cheap flights (may be for US only) (http://www.pcmag.com/slideshow/0,1206,l=226215&s=25306&a=226221,00.asp)
They are: Airfarewatchdog.com, BookingBuddy, Farecast, Hotwire, InsideTrip, Kayak, Mobissimo, SeatGuru, SideStep, Yapta, (and a bonus) Virgin Air Charter.

Friday, April 4, 2008

Reading notes for "Building and Managing Meta Data Repository," Part I

This is a wonderful book by David Marco. As Mr. Inmon suggested, one has to come to Mr. Marco's writings when looking for "meta data" education.

Here are my notes on the first part.

Part 1, Laying the Foundation

One reason to have a metadata system is to keep information flexibility and integrity in an enterprise’s IT system. Another reason is that, due to the fast growth of data volume, many enterprises have to split data from a single server to multiple systems and maybe, get a federation server to help people to use it. In such conditions, it becomes much more important to control the metadata.

Most metadata systems nowadays are providers of information, not monitors. But a metadata system is at its most powerful when you are able not only to get the metadata, but also to modify it and manage those modifications.

What can be the ROI of metadata? There are quite a few benefits.

1. Data definition reporting
It is, indeed, a very basic metadata solution, and it is more or less a data dictionary. Very experienced people often do not sense the importance of this benefit. But for less-experienced IT people and business users, this is a must-have.

2. Data quality tracking
Controlling data redundancy, accuracy, and completeness is always worthwhile.

3. Business user access to metadata
If there is a semantic layer between the IT systems and the business users, it will become quite easy for the business users to understand the data. For example, a business user may get a report but want to know how the values in the columns are calculated. Here the business metadata comes into play.

4. Impact analysis
If an enterprise has an enterprise-wide metadata system, it becomes very easy to do impact analysis on most subjects. And if the data is kept at a high quality, the result of the impact analysis will be an excellent input to decision-making or enterprise analysis.
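Impact analysis (benefit 4) is essentially a walk over lineage metadata: given a source that is about to change, find everything downstream of it. Here is a minimal Python sketch under the assumption that lineage is stored as "X feeds Y" pairs; the object names in `lineage` are made up for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical lineage metadata: each key feeds the items in its list.
lineage: Dict[str, List[str]] = {
    "crm.customer": ["staging.customer"],
    "staging.customer": ["dw.dim_customer"],
    "dw.dim_customer": ["mart.sales_by_customer", "report.top_customers"],
    "mart.sales_by_customer": ["report.monthly_sales"],
}


def impacted_by(source: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return everything downstream of `source`, in breadth-first order."""
    seen = {source}
    queue = deque([source])
    result: List[str] = []
    while queue:
        current = queue.popleft()
        for target in graph.get(current, []):
            if target not in seen:  # guard against cycles and duplicates
                seen.add(target)
                result.append(target)
                queue.append(target)
    return result
```

For example, `impacted_by("staging.customer", lineage)` lists the dimension table, the data mart, and both reports, which is the set of objects to review before changing the staging table.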

A good example for understanding what metadata is, is the card catalog in a library. Normally, in a data warehouse environment, there are two types of metadata, technical and business. You can tell them apart by their target group: technical metadata supports technical and IT users, and business metadata supports business users.

Even though external data is often ad hoc and unstable, it is important, when an external data source is used, to have and maintain metadata about that source.

There are three major types of metadata users: business users, technical users, and power users.

In the data life cycle of a data warehouse, there are many parts or components that can produce metadata, for example the ETL tools, the data modeling tools, the reporting tools, and the data quality tools. There are vendors that provide independent systems for metadata management, but such systems look more like a metadata source than a solution. When there are third-party applications that are focused on one or a few business areas, such as CRM or ERP systems, the management of metadata may become a bit more complicated. The reason is that these vendors do not want users to manipulate their internal infrastructure (because this may lead users to build their own systems rather than use the vendors').

The metadata of an enterprise comes from two types of sources, structured and unstructured. The structured sources are those that people have discussed, documented, and agreed on. They are kept well in tools and documents. On the other hand, much of the most useful information is actually unstructured. It is on a Post-It note or just in some people’s minds (and assumed to be common sense). Still, such information should be captured, recorded, and managed if possible.

What has not been quite established, even nowadays (8 years after this book was published), is metadata security. In general, there are two approaches to metadata security: proactive security, which prevents unwanted access before it occurs, and reactive security, which uses audits to check what has happened.
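The two approaches can sit side by side in the same access path: a permission check before the read (proactive) and an audit record of every attempt (reactive). The Python sketch below only illustrates that distinction; the roles, metadata areas, and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical permissions: which roles may read which metadata areas.
PERMISSIONS: Dict[str, Set[str]] = {
    "business_user": {"business"},
    "technical_user": {"business", "technical"},
}

# Every access attempt is recorded as (role, area, allowed).
audit_log: List[Tuple[str, str, bool]] = []


def read_metadata(role: str, area: str) -> bool:
    """Check access before it happens and record the attempt for later audits."""
    allowed = area in PERMISSIONS.get(role, set())  # proactive: decide up front
    audit_log.append((role, area, allowed))         # reactive: keep a trail
    return allowed


def denied_attempts() -> List[Tuple[str, str]]:
    """Reactive audit: list the attempts that were refused."""
    return [(role, area) for role, area, allowed in audit_log if not allowed]


read_metadata("technical_user", "technical")  # allowed
read_metadata("business_user", "technical")   # refused, but still logged
```

After these two calls, `denied_attempts()` reports the business user's refused read of technical metadata, which is the kind of finding an audit would follow up on.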

The meta model is important when it comes to a standard for different tools to interchange information. There used to be the MDC meta model (backed by Microsoft) and the CWM meta model (backed by the OMG folks), but MDC merged into CWM around 2002. And since XML is so popular now, the meta model should also be represented in XML.

Wednesday, April 2, 2008

What's new in week 14, 2008

Wednesday, 2008-04-02, Copenhagen

There has been a rumor that Microsoft is testing an online version of Google Apps Premier Edition (GAPE). It seems that (if this is TRUE) MS is seriously looking at the threats that Google poses to its desktop tool suite: Word, Excel, PowerPoint, etc. And MS is also launching projects (such as Albany) to compete on this front. Is MS becoming Google or the other way around? No doubt both sides are working in the web 2.0 direction.

Facebook keeps reinventing itself from time to time. It recently introduced a ‘People you may know’ feature, so that it becomes easier to fill out your buddy list and, what’s more, to find opportunities to re-connect.

Online document collaboration tools: where did they start? Perhaps one could say they originated when Google acquired Writely, which was a web-hosted word processing tool. Microsoft now provides a new service called Office Live Workspace to let people collaborate on documents. And there are other players on this front line. Adobe has Buzzword (a Flash-based tool), and AOL has just acquired Goowy and, of course, its online workspace tech. ThinkFree Online offers an online office tool, and there is Zoho.

Thursday, 2008-04-03, Copenhagen

Apple has just been voted the most influential brand in the world! The poll was run by the online magazine brandchannel.com. Microsoft and the United States nation brand came 2nd and 3rd. It seems that, with the iPod and Mac computers, Apple has really buried itself in the minds of a majority of people as the coolest and most impressive stuff.

Oops, there is local news. The Danish telecoms operator TDC just announced that it will offer free music downloads to its mobile phone and broadband Internet customers in cooperation with EMI, Warner Music, and Sony BMG. Is anything in TDC's plan hidden under the table? Let us just cross our fingers and see.

OOXML vs. ODF? ISO is working on it now. Microsoft has pushed quite hard for the certification of Office Open XML (OOXML) so that it can become an international standard supported by ISO, while Sun has already introduced the Open Document Format (ODF), which is ISO-approved. Google actually used the ODF standard in the Google Documents applications. So this becomes yet another war between MS and the ‘open world’ led by Sun.

Comcast has made an agreement with BitTorrent that it will re-configure its network management practices so that users of P2P services will not be discriminated against. Comcast admitted that P2P traffic used to be ‘delayed’ at peak times. P2P services are believed to be one of the sources of illegal use of copyrighted content. But net neutrality is the major issue, and Comcast seems to be doing the right thing now.

Facebook has made a deal with CareerBuilder to start a job recruiting campaign. It seems that the US style of job recruiting is moving onto the Internet as well. LinkedIn has actually used such an idea for quite a while.

Friday, 2008-04-04, Copenhagen

Microsoft has been talking about LINQ (Language Integrated Query) for a while and has made quite a lot of progress toward this target. LINQ bridges the gap between programming languages and databases. With LINQ, all kinds of data that occur in a program flow can be queried the way standard SQL queries a standard DB dataset. But my question is, are you trying to clear things up or are you trying to confuse people more? My belief is that LINQ will last a generation, but just a generation, meaning that it will definitely be replaced and people may come back to the view that the language should be separated from the database (remember what FoxPro and FoxBase did?).
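The gap in question can be illustrated outside of .NET. The Python sketch below is not LINQ and does not use LINQ syntax; it only contrasts a query embedded as a SQL string (run here against an in-memory SQLite table) with the same query written in the host language over an ordinary list. The `orders` data is made up for the example.

```python
import sqlite3

# Hypothetical in-program data: (customer, order amount).
orders = [("alice", 120), ("bob", 40), ("carol", 300)]

# Separate worlds: the query is a string that the language itself cannot check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
sql_result = [
    row[0]
    for row in conn.execute(
        "SELECT customer FROM orders WHERE amount > 100 ORDER BY amount DESC"
    )
]
conn.close()

# Integrated style: the same filter and ordering expressed in the language,
# applied directly to the in-memory list.
integrated_result = [
    customer
    for customer, amount in sorted(orders, key=lambda o: o[1], reverse=True)
    if amount > 100
]
```

Both results contain the same customers in the same order. The integrated form gets help from the language (names, syntax, tooling), while the string form keeps the database logic clearly separate, which is the trade-off questioned above.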

NAC, network access control, is a good idea for people who want full and flexible control of their networks. Right now there are only two vendors in the market, Cisco and Microsoft. Most other competitors have either disappeared or moved to other business areas.