This post is authored by Miguel Fierro, Data Scientist, Mathew Salvaris, Data Scientist, Guolin Ke, Associate Researcher, and Tao Wu, Principal Data Science Manager, all at Microsoft.
Boosted decision trees are responsible for more than half of the winning solutions in machine learning challenges hosted at Kaggle, according to KDnuggets. In addition to superior performance, these algorithms have practical appeal because they require minimal tuning. In this post, we evaluate two popular tree boosting software packages, XGBoost and LightGBM, including their GPU implementations. Our results, based on tests on six datasets, are summarized as follows:
- XGBoost and LightGBM achieve similar accuracy metrics.
- LightGBM has lower training time than XGBoost and its histogram-based variant, XGBoost hist, for all test datasets, on both CPU and GPU implementations. The training time difference between the two libraries depends on the dataset, and can be as large as 25 times.
- The XGBoost GPU implementation does not scale well to large datasets and ran out of memory in half of the tests.
- XGBoost hist may be significantly slower than the original XGBoost when feature dimensionality is high.
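To make a statement like "as big as 25 times" concrete, here is a minimal sketch of how training times could be measured and turned into ratios. It is not the benchmark code from the post: the helper names `time_training` and `speedup_table` are hypothetical, and the commented usage assumes the `xgboost` and `lightgbm` Python packages are installed and that `X` and `y` hold a training set.

```python
import time
from typing import Callable, Dict


def time_training(fit: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time (seconds) over `repeats` calls to `fit`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fit()  # train one model from scratch
        best = min(best, time.perf_counter() - start)
    return best


def speedup_table(timings: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Express each timing as baseline_time / time; values above 1 are faster than the baseline."""
    base = timings[baseline]
    return {name: base / seconds for name, seconds in timings.items()}


# Hypothetical usage, assuming xgboost and lightgbm are installed and X, y exist:
#
#   import xgboost, lightgbm
#   timings = {
#       "xgboost": time_training(lambda: xgboost.XGBClassifier().fit(X, y)),
#       "xgboost_hist": time_training(
#           lambda: xgboost.XGBClassifier(tree_method="hist").fit(X, y)),
#       "lightgbm": time_training(lambda: lightgbm.LGBMClassifier().fit(X, y)),
#   }
#   print(speedup_table(timings, baseline="xgboost"))
```

Taking the best of several runs is one common way to reduce timing noise; a real comparison would also need matched hyperparameters (number of trees, depth or leaf count, learning rate) so that the libraries are doing comparable work.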
All our code is open-source and can be found in this repo. We will explain the algorithms behind these libraries
Source: https://powerbi.microsoft.com/en-us/blog/july-27-webinar-what-s-new-and-exciting-with-common-data-service/ On July 27, join the Common Data Service team in a webinar to learn what’s new and exciting with CDS.
Source: https://powerbi.microsoft.com/en-us/blog/happy-second-birthday-to-power-bi/ Today, Power BI is two years old! When we released the new Power BI service, Power BI Desktop, and Power BI Mobile on July 24, 2015, we knew we had an innovative business…
Source: https://powerbi.microsoft.com/en-us/blog/explore-your-partner-center-analytics-data-in-power-bi/ We’re pleased to announce the public preview of the new Partner Center Analytics App for Power BI for direct partners. Get a visual representation of your business data with Partner Center…
Source: https://blogs.microsoft.com/iot/2017/07/20/roadside-assistance-provider-the-rac-revolutionizes-customer-service-with-iot/ Imagine a world in which you’re alerted before you even leave the house that your vehicle might break down. You obtain driver recommendations to improve safety and save fuel based on your actual driving patterns. And…
Source: https://blogs.msdn.microsoft.com/azuredatalake/2017/07/19/analyze-data-in-azure-data-lake-store-using-familiar-and-powerful-excel-2016/ We are excited to announce that as part of the June 2017 updates of Excel 2016, Azure Data Lake Store is now supported as a source of data. Sophisticated and powerful tools like Excel and Power BI…
This post was authored by Meet Bhagdev, Program Manager, Microsoft
We are excited to announce the Production Ready release of the Microsoft Drivers v4.3.0 for PHP for SQL Server. The drivers now support Debian Jessie and macOS, and enable access to SQL Server, Azure SQL Database, and Azure SQL DW from any PHP application on Linux, Windows, and macOS.
Notable items for the release:
- Added PHP 7.1 support
- Added Unicode Column name support (issue #138)
- Support for transparent connections to AlwaysOn Availability Groups. The driver quickly discovers the current AlwaysOn topology of your server infrastructure and connects to the current active server transparently.
- Added support for sql_variant data type with limitation (issue #51 and issue #127)
- Debian Jessie (tested on Debian 8.7) and macOS (El Capitan and above) support
- Connection Resiliency support for Windows
- Connection pooling support for Linux and macOS
Fixed
- Fixed the assertion error (Linux) when fetching data from a binary column using the binary encoding (issue #226)
- Fixed PECL installation errors when PHP was installed from source (issue #213)
- Fixed issue with output parameters bound to empty string (issue #182)
- Fixed a memory leak in closing connection resources
- Fixed load ordering issue in macOS (issue
This post was authored by Tony Petrossian, Partner Group Program Manager, Database Systems Group
SQL Server 2017 will bring with it support for the Linux OS and containers running on Windows, Linux, and macOS. Our goal is to enable SQL Server to run in modern IT infrastructure in any public or private cloud.
With support for containers, SQL Server can now be used in many popular DevOps scenarios. Developers working with Continuous Integration/Continuous Deployment (CI/CD) pipelines can now include SQL Server 2017 containers as a component of their applications for an integrated build, test, and deploy experience.
CI/CD automation with containers – Using containers greatly simplifies the development, testing, and deployment of applications. This is achieved by the packaging of all dependencies, including SQL Server, into a portable, executable environment that reduces variability and increases the speed of every iteration in the CI/CD pipeline. This also enforces a consistent experience for all participants since they can share the same state of an application in their containers. Developers can improve applications in their local environments during the first part of the Continuous Integration process.
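As a rough illustration of the packaging idea, a CI job could start a disposable SQL Server container before running integration tests. The sketch below only builds the `docker run` command line; it is an assumption-laden example, not the workflow from the post. The function name `sql_server_container_cmd`, the container name `sql-ci`, and the image tag `mcr.microsoft.com/mssql/server:2017-latest` are choices made for illustration and do not come from the original text.

```python
from typing import List


def sql_server_container_cmd(
    sa_password: str,
    name: str = "sql-ci",  # hypothetical container name
    host_port: int = 1433,
    # Assumed image tag; pin whichever image your pipeline actually uses.
    image: str = "mcr.microsoft.com/mssql/server:2017-latest",
) -> List[str]:
    """Build the argv for a throwaway SQL Server container in a CI job."""
    return [
        "docker", "run",
        "--detach",                 # return immediately; tests run afterwards
        "--rm",                     # remove the container when it stops
        "--name", name,
        "--env", "ACCEPT_EULA=Y",   # the image will not start without EULA acceptance
        "--env", f"SA_PASSWORD={sa_password}",
        "--publish", f"{host_port}:1433",  # map a host port to SQL Server's default port
        image,
    ]


# Hypothetical usage inside a pipeline step (requires Docker on the build agent):
#
#   import subprocess
#   subprocess.run(sql_server_container_cmd("<strong password>"), check=True)
```

Returning the command as a list rather than a shell string avoids quoting problems when the password contains special characters. Because every run starts from the same image, each build and test iteration sees the same database state.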
The development process starts by taking a container that represents the current state of a production application,
We are pleased to announce the first public release candidate for SQL Server 2017, Release Candidate 1 (RC1), available for download now. This means that development work for the new version of SQL Server is complete along most dimensions to bring the industry-leading performance and security of SQL Server to Windows, Linux, and Docker containers. In our seven community technology previews (CTPs) to date, SQL Server 2017 has delivered:
- Linux support for tier-1, mission-critical workloads – SQL Server 2017 support for Linux includes the same high availability solutions on Linux as Windows Server, including Always On availability groups integrated with Linux native clustering solutions like Pacemaker.
- Graph data processing in SQL Server – With the graph data features available in SQL Server 2017 and Azure SQL Database, customers can create nodes and edges, and discover complex and many-to-many relationships.
- Adaptive query processing – Adaptive query processing is a family of features in SQL Server 2017 that automatically keeps database queries running as efficiently as possible without requiring additional tuning from database administrators. In addition to the capability to adjust batch mode memory grants, the feature set includes batch mode adaptive joins and interleaved execution capabilities.
- Python integration for advanced analytics
Source: https://blogs.msdn.microsoft.com/azuredatalake/2017/07/14/azure-data-lake-tools-for-visual-studio-code-vscode-july-updates/ We are pleased to announce the July updates of Azure Data Lake Tools for VSCode. This is a quality milestone, and we added local debug capability for C# code-behind for Windows users, refined Azure Data Lake…