24

Mar

Hive Metastore in HDInsight – Tips, Tricks & Best Practices

Source: https://blogs.msdn.microsoft.com/azuredatalake/2017/03/24/hive-metastore-in-hdinsight-tips-tricks-best-practices/

When you create a Hive table, the table definition (column names, data types, comments, etc.) is stored in the Hive Metastore. The Hive Metastore is a critical part of the Hadoop architecture, as it acts as a central schema repository ...

23

Mar

Announcing R Tools 1.0 for Visual Studio 2015

This post is authored by Shahrokh Mortazavi, Partner Director of Program Management at Microsoft.

I’m delighted to announce the General Availability of R Tools 1.0 for Visual Studio 2015 (RTVS). This release will be shortly followed by R Tools 1.0 for Visual Studio 2017 in early May. RTVS is a free and open source plug-in that turns Visual Studio into a powerful and productive R development environment. Check out this video for a quick tour of its core features:

Core IDE Features

RTVS builds on Visual Studio, which means you get numerous features for free, from using multiple languages to world-class editing and debugging, to over 7,000 extensions for every conceivable need.

A polyglot IDE – VS supports R, Python, C++, C#, Node.js, SQL, etc. projects simultaneously.
Editor – complete editing experience for R scripts and functions, including detachable/tabbed windows, syntax highlighting, and much more.
IntelliSense – aka auto-completion, available in both the editor and the Interactive R window.
R Interactive Window – work with the R console directly from within Visual Studio.
History window – view, search, select previous commands and send to the Interactive window.
Variable Explorer – drill into your R data structures and examine their values.
Plotting – see all your

23

Mar

Easy way to get statistics histogram programmatically

Source: https://blogs.msdn.microsoft.com/sql_server_team/easy-way-to-get-statistics-histogram-programmatically/

Statistics being the building blocks on which the Query Optimizer reasons to compile a good enough plan to resolve queries, it’s very common that anyone doing query performance troubleshooting needs to use DBCC SHOW_STATISTICS to understand how ...

23

Mar

Five reasons to run SQL Server 2016 on Windows Server 2016 — No. 1: Security

This is the first blog in a five-part series. Keep an eye out for upcoming posts, which will cover cutting costs and improving performance of storage, BI, and analytics; improving uptime and reliability; reaching data insights faster by running analytics at the point of creation; and maintaining a consistent data environment across on-premises, hybrid, and cloud environments.

Wall, ditch, moat, palisades, watch towers, guards, highly trained soldiers: Even 2,000 years ago, when the Romans built their defenses, they deployed multiple layers of protection to deter invaders and keep intruders out. Today, on the electronic front, IT environments demand no less than a strong, layered approach to ensuring that data assets are protected from attacks such as stolen administrator credentials, unauthorized access, and pass-the-hash exploits.

You can see how important security is by examining the cost of data breaches, which is growing rapidly and represents a significant risk to business, as Figure 1 illustrates. To address this, Microsoft’s $1 billion annual investment in security demonstrates the company’s longstanding and proven commitment to building security capabilities into both its applications and operating systems. This means you can take advantage of layered security and mitigate risk.

Figure 1: Growing cost of data breach

23

Mar

The Power of SharePoint

Source: https://powerbi.microsoft.com/en-us/blog/the-power-of-sharepoint/

Do you have SharePoint Online and want to better automate and streamline your business processes? Have you heard of PowerApps, Microsoft Flow, or Power BI, but you’re not sure how to use them ...

22

Mar

Retail Customer Churn Prediction: How-To Guide Now Available

This post is authored by Lixun Zhang, Data Scientist, Daisy Deng, Software Engineer, and Tao Wu, Principal Data Scientist Manager, at Microsoft.

Predicting customer churn rate is among the most sought-after machine learning and analytics applications for retail stores, and of high value to companies that are eager to take advantage of the ever-increasing amounts of customer data they are collecting. Retaining existing customers is estimated to be five times cheaper than the cost of attracting new ones, and so businesses want to be proactive about things and predict who is likely to churn before it happens. Businesses also wish to identify the factors that are related to high churn rates, which in turn helps them apply resources towards acquiring the right type of customers in the first place.
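As a rough illustration of what "churn" can mean in practice (this sketch is not taken from the guide), the snippet below labels a customer as churned when they have made no purchase within a fixed inactivity window and then computes the resulting churn rate. The 30-day window, the function names, and the sample data are all assumptions made for this example; the guide may define churn differently.

```python
from datetime import date, timedelta

# Hypothetical inactivity window; an assumption for this sketch only.
CHURN_WINDOW_DAYS = 30


def label_churned(last_purchase, as_of, window_days=CHURN_WINDOW_DAYS):
    """Map customer id -> True if the last purchase is older than the window."""
    cutoff = as_of - timedelta(days=window_days)
    return {cid: purchased < cutoff for cid, purchased in last_purchase.items()}


def churn_rate(labels):
    """Fraction of customers labelled as churned (0.0 for empty input)."""
    return sum(labels.values()) / len(labels) if labels else 0.0


# Made-up sample data: customer id -> date of most recent purchase.
last_purchase = {
    "c1": date(2017, 3, 20),
    "c2": date(2017, 1, 5),
    "c3": date(2017, 2, 25),
    "c4": date(2016, 12, 1),
}

labels = label_churned(last_purchase, as_of=date(2017, 3, 22))
rate = churn_rate(labels)
print(labels, rate)
```

Labels like these would typically serve as the target variable for a predictive model, with customer attributes and purchase history as the input features.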

Microsoft has been active in the domain of churn prediction, having published several resources to help businesses understand the data science process behind customer churn prediction.

We are now pleased to announce the Retail Customer Churn Prediction Solution How-to Guide, available in Cortana Intelligence Gallery and a GitHub repository.

What’s the Guide About?

The Guide includes a Solution Overview for Business Audiences and a Technical Deployment Guide that provides the

22

Mar

End-to-End Data Science Walkthrough with Spark 2.0 on Azure HDInsight Hadoop Clusters

This post is authored by Debraj GuhaThakurta, Senior Data Scientist, and Brad Severtson, Senior Content Developer, at Microsoft.

The data scientists among you will have seen how Spark 2.0, which was released in July 2016, offered several enhancements over Spark 1.6. These enhancements included:

Easier ANSI SQL and more streamlined APIs.
Improvements in the speeds of data processing and summarization.
Enhancements to Spark’s ML library, including DataFrame-based primary ML APIs.
Easier RFormula-based specification of training formulae, as well as improvements in pipelining and cross-validation features.
ML pipeline persistence.
A new structured streaming API.

Microsoft Azure released Spark 2.0 on HDInsight (Linux) as a service in September 2016. To help users get a jump start on using Spark 2.0 on HDInsight for data science and machine learning, we are providing end-to-end data science walkthroughs using Spark 2.0 on HDInsight.

This is an update to the Spark 1.6-based walkthrough that we published in June 2016 as part of the Team Data Science Process documentation. That release contained a comprehensive walkthrough using PySpark and MLlib to demonstrate how to conduct end-to-end data science on Azure HDInsight Spark 1.6 clusters. Using detailed examples and PySpark code, made publicly available from a GitHub

22

Mar

Columnstore Index – How to Estimate Compression Savings

Source: https://blogs.msdn.microsoft.com/sql_server_team/columnstore-index-how-to-estimate-compression-savings/

The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. Please refer to List of Blogs for all blogs published by SQL Tiger ...

22

Mar

Custom visuals now available in the Office store

Source: https://powerbi.microsoft.com/en-us/blog/custom-visuals-now-available-in-the-office-store/

Custom visuals have come a long way since the platform was announced as part of Power BI GA back in July 2015. From best visual contest, to a new and improved version of ...

22

Mar

The Microsoft Data Insights Summit is back – check out our full session catalog today!

Source: https://powerbi.microsoft.com/en-us/blog/the-microsoft-data-insights-summit-is-back-check-out-our-full-session-catalog-today/

In February, we announced that our Microsoft Data Insights Summit will return for the second year, and invited you to register and join us June 12-13, 2017 in Seattle, WA. This is THE ...