Feed aggregator

what is lobsegment,lobindex

Tom Kyte - Wed, 2024-02-28 02:26
I queried the table 'user_segments':

sql> select segment_name, segment_type, tablespace_name, bytes, max_extents from user_segments where segment_type like 'LOB%'

The result is:

SEGMENT_NAME               SEGMENT_TYPE  TABLESPACE_NAME
SYS_IL0000012099C00002$$   LOBINDEX      CPOCKET_DATA
SYS_LOB0000012099C00002$$  LOBSEGMENT    CPOCKET_DATA

I don't know what lobsegment and lobindex are. What are they and why are they needed? Can't I delete them? Please explain lobsegment and lobindex in detail. Thank you.
Categories: DBA Blogs

How to do update (replace values) in a table contains 50+ million records?

Tom Kyte - Wed, 2024-02-28 02:26
Hi, I have a table containing 50+ million records, and I am writing a procedure to replace bad data with the correct values (about 1500 records). K_V is the array of bad data and target correct values, like K_V('bad data1') := 'correct value1'. When I loop over K_V, I run 'update table set xx=replace(xx,bad data,correct value);'. This procedure ran the whole night but still could not finish. How can I deal with this problem? It seems I cannot write the procedure that way. Thanks.
Categories: DBA Blogs

"alter session sync with primary" with Maximum Performance Protection Mode

Tom Kyte - Wed, 2024-02-28 02:26
"alter session sync with primary" raises ORA-03173 for us.

SQL> select database_role, open_mode, db_unique_name from v$database;

DATABASE_ROLE    OPEN_MODE            DB_UNIQUE_NAME
---------------- -------------------- ------------------------------
PHYSICAL STANDBY READ ONLY WITH APPLY mdpams

SQL> alter session sync with primary;
ERROR:
ORA-03173: Standby may not be synced with primary

Is this expected behaviour in protection mode "Maximum Performance" or do we maybe hit some bug? dgmgrl shows nothing suspicious:

DGMGRL> show configuration

Configuration - fsc
  Protection Mode: MaxPerformance
  Members:
  mdpfra - Primary database
    mdpams - Physical standby database
    mdpdev - Snapshot standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 61 seconds ago)

DGMGRL> show database mdpams

Database - mdpams
  Role: PHYSICAL STANDBY
  Intended State: APPLY-ON
  Transport Lag: 0 seconds (computed 0 seconds ago)
  Apply Lag: 0 seconds (computed 0 seconds ago)
  Average Apply Rate: 4.94 MByte/s
  Real Time Query: ON
  Instance(s):
    mdpams1 (apply instance)
    mdpams2
Database Status:
SUCCESS

DGMGRL>
Categories: DBA Blogs

Similarity Search with Oracle’s Vector Datatype

DBASolved - Tue, 2024-02-27 09:36

In my last two posts I showed you what the Oracle Vector Datatype is and how to update existing data […]

The post Similarity Search with Oracle’s Vector Datatype appeared first on DBASolved.

Categories: DBA Blogs

View with pivot and group by grouping sets work in 12c but not in 19. Error ORA-56903

Tom Kyte - Tue, 2024-02-27 08:06
Views with GROUP BY GROUPING SETS and PIVOT, either directly or in a referenced view, which work in Oracle 12c fail in Oracle 19 with error ORA-56903: sys_op_pivot function is not allowed here. But neither the view nor the referenced view contains an explicit call to sys_op_pivot. Maybe Oracle uses it during execution of the views. Thanks in advance. Best regards. Following a suggestion, I have run: alter session set optimizer_features_enable = '' but the error persists. It appears in every view with grouping sets and a pivot clause, whether the pivot is in a subview used as the base view or directly in the current view. The subview containing the pivot clause works correctly on its own.
Categories: DBA Blogs

Record / Check Login Information for Standby DBs

Tom Kyte - Tue, 2024-02-27 08:06
Hello

We want to housekeep our user accounts and remove unused and locked accounts. As far as I understand, the information in dba_users is from the primary DB. Users are not allowed to log on to the primary to query data; they must log on to the read-only standby (regulated by a trigger). When I look in dba_users on the standbys I can see several users that have not, or have never, logged on:

select username, account_status, nvl(to_char(last_login),'never logged on') "Last Login"
from dba_users
where oracle_maintained = 'N'
and username not in ('AAAAAAAAAAA','BBBBBB','CCCCCCC')
and username not like '%READ%'
and username not like '%Exxx%'
order by "Last Login" desc;

USERNAME                       ACCOUNT_STATUS  Last Login
------------------------------ --------------- ----------------------------------------
Pxxxxxxx                       OPEN            never logged on
Pxxxxxxx_03                    LOCKED          never logged on
Pxxxxxxx_05                    LOCKED          never logged on
Pxxxxxxx_04                    LOCKED          never logged on
Pxxxxxxx_01                    LOCKED          never logged on
BRxxxxxxx                      OPEN            never logged on
Pxxxxxxx_02                    LOCKED          never logged on
Sxxxxxxx                       EXPIRED         never logged on
Jxxxxxxx                       EXPIRED         never logged on
Mxxxxxxx                       OPEN            2020-09-05:19:48:06 GMT+01:00
Bxxxxx                         OPEN            2020-09-05:19:19:52 GMT+01:00
Axxxxxx                        OPEN            2016-05-20:09:17:33 GMT+01:00
Pxxxxxxxxxx_01                 OPEN            2016-04-21:10:48:34 GMT+01:00
Kxxxxx                         OPEN            2016-04-19:13:50:33 GMT+01:00
Pxxxxxxxxxx_01                 OPEN            2016-04-13:14:18:17 GMT+01:00

However, this information from dba_users is identical on the primary and standby DBs. The users told me that they have logged on to the standby recently. As far as I understand, the information in dba_users on the standby has also been inherited from the primary, as normal catalogue tables are not updated on the standby. Is this correct? How can I see last logins on the standby, preferably without using auditing, which could cause a performance degradation? This is a production system where performance is key.

Many thanks
Alison

We are using Active Data Guard, and our idea at the moment is to record logins to the standby using a trigger which checks whether it is running on the standby or the primary and then writes logon data across a DB link into a table on the primary. Many thanks
Categories: DBA Blogs

move table to new tablespace

Tom Kyte - Tue, 2024-02-27 08:06
Hi Tom, I have an Oracle EE cluster database with 2 nodes, and I have a big table with large LOBs in its rows. After compressing the LOBs I moved them to a new tablespace, so the main table is now only about 300MB, but it still takes up 1.2TB of space. My concern is why I can't reclaim this space. I created a new tablespace and did a table move, but it doesn't work (it took a very long time), and I did a shrink, but that doesn't work either. I think there is a problem with the high watermark? What do I have to do to reclaim this space? Thanks. The LOBs were stored in the same tablespace as the other data, called DATA, which has 1.2TB. After that I moved them to a new tablespace I created called LOB_DATA. The problem is that shrink space for the tablespace DATA did nothing and the table move also doesn't work, so how can I reclaim the free extents in the 1.2TB DATA tablespace?
Categories: DBA Blogs

Capacity Planning

Tom Kyte - Tue, 2024-02-27 08:06
Hi Tom, I have some questions regarding capacity planning. Thanks in advance.

1. Is there any way we can match LIOs and PIOs to the number of CPUs and the number of disks?
2. Is there any place I can find documents on capacity planning for an Oracle database/Sun Solaris environment?
3. I am very confused about sort_area_size. My understanding is:
-- sort_area_size is the maximum threshold for sorting in memory, and there is only one sort_area_size per session. Allocated from UGA.
-- sort_area_retained stores the result set from SAS, and there can be many per session at a time. Allocated from PGA.
Is this correct? When we do a first sort which is smaller than sort_area_size, is the memory allocated from PGA or UGA? Is it sort_area_size or sort_area_retained?

Thanks in advance
Regards
Jeyaseelan.M
Categories: DBA Blogs

Installing and Running Oracle AHF ORACHK on a 12.2 DB Server

Hemant K Chitale - Tue, 2024-02-27 00:20

 The Oracle Autonomous Health Framework is described in Support Document "Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk (Doc ID 2550798.1)"

In a recent video I demonstrated running 24.1 orachk (with "-b" for "Best Practices Check") against a 21.3 RAC Cluster.

Here I demonstrate the installation and execution against a 12.2 non-RAC database.

When you download the 24.1 release of AHF (AHF-LINUX_v24.1.0.zip, approximately 410MB), you have to unzip it and then run ahf_setup.  It is preferable to use the default location /opt/oracle.ahf  (and precreate a "data" subfolder if it doesn't exist).

If your first attempt at installation returns an error :

[ERROR] : AHF-00074: Required Perl Modules not found :  Data::Dumper

you can check the perl version and download and install this module (Note : In the listings below "AHF_Installer" is the location where I have extracted the installation zip file).

[root@vbgeneric AHF_Installer]# /bin/perl -v

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
(with 34 registered patches, see perl -V for more detail)

Copyright 1987-2012, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

[root@vbgeneric AHF_Installer]# yum install perl-Data-Dumper
Loaded plugins: langpacks, ulninfo
ol7_UEKR4                                                           | 3.0 kB  00:00:00     
ol7_latest                                                          | 3.6 kB  00:00:00     
(1/5): ol7_latest/x86_64/group_gz                                   | 136 kB  00:00:00     
(2/5): ol7_UEKR4/x86_64/updateinfo                                  | 130 kB  00:00:00     
(3/5): ol7_latest/x86_64/updateinfo                                 | 3.6 MB  00:00:00     
(4/5): ol7_latest/x86_64/primary_db                                 |  50 MB  00:00:02     
(5/5): ol7_UEKR4/x86_64/primary_db                                  |  37 MB  00:00:04     
Resolving Dependencies
--> Running transaction check
---> Package perl-Data-Dumper.x86_64 0:2.145-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

 Package                   Arch            Version               Repository           Size
 perl-Data-Dumper          x86_64          2.145-3.el7           ol7_latest           47 k

Transaction Summary
Install  1 Package

Total download size: 47 k
Installed size: 97 k
Is this ok [y/d/N]: y
Downloading packages:
perl-Data-Dumper-2.145-3.el7.x86_64.rpm                             |  47 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : perl-Data-Dumper-2.145-3.el7.x86_64                                     1/1 
  Verifying  : perl-Data-Dumper-2.145-3.el7.x86_64                                     1/1 

  perl-Data-Dumper.x86_64 0:2.145-3.el7                                                    

[root@vbgeneric AHF_Installer]#

Then resume the installation (precreate the "data" folder if it doesn't exist)

[root@vbgeneric AHF_Installer]# mkdir /opt/oracle.ahf/data
[root@vbgeneric AHF_Installer]# ./ahf_setup

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_241000_6588_2024_02_27-13_48_51.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 24.1.0 Build Date: 202402051317

Default AHF Location : /opt/oracle.ahf

Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : Y

AHF Location : /opt/oracle.ahf

AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.

Please Enter AHF Data Directory : /opt/oracle.ahf/data

AHF Data Directory : /opt/oracle.ahf/data

Do you want to add AHF Notification Email IDs ? [Y]|N : N

Extracting AHF to /opt/oracle.ahf

Setting up AHF CLI and SDK

Configuring TFA Services

Discovering Nodes and Oracle Resources

Successfully generated certificates.

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

| Host      | Status of TFA | PID  | Port  | Version    | Build ID              |
| vbgeneric | RUNNING       | 8540 | 39049 | | 240100020240205131724 |

Running TFA Inventory...

Adding default users to TFA Access list...

|              Summary of AHF Configuration             |
| Parameter       | Value                               |
| AHF Location    | /opt/oracle.ahf                     |
| TFA Location    | /opt/oracle.ahf/tfa                 |
| Orachk Location | /opt/oracle.ahf/orachk              |
| Data Directory  | /opt/oracle.ahf/data                |
| Repository      | /opt/oracle.ahf/data/repository     |
| Diag Directory  | /opt/oracle.ahf/data/vbgeneric/diag |

Starting ORAchk Scheduler from AHF

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully Installed

Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N

Moving /tmp/ahf_install_241000_6588_2024_02_27-13_48_51.log to /opt/oracle.ahf/data/vbgeneric/diag/ahf/

[root@vbgeneric AHF_Installer]# 

orachk can then be executed.  This execution is to check against "Best Practices"  :

[root@vbgeneric AHF_Installer]# orachk -b

List of running databases

1. orcl12c
2. None of above

Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1]. 1
.  .
.  .  

Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS

.  .  . . . .  
.  .  .  .  .  .  .  .  .  
                                                 Oracle Stack Status                          
  Host Name       CRS Installed       ASM HOME  RDBMS Installed    CRS UP    ASM UP  RDBMS UP    DB Instance Name
  vbgeneric                  No           No          Yes           No       No      Yes             orcl12c

Copying plug-ins

. .
.  .  .  .  .  .  

*** Checking Best Practice Recommendations ( Pass / Warning / Fail ) ***


              Node name - vbgeneric
. . . . . . 
 Collecting - Database Parameters for orcl12c database
 Collecting - Database Undocumented Parameters for orcl12c database
 Collecting - List of active logon and logoff triggers for orcl12c database
 Collecting - CPU Information
 Collecting - Disk I/O Scheduler on Linux
 Collecting - DiskMount Information
 Collecting - Kernel parameters
 Collecting - Maximum number of semaphore sets on system
 Collecting - Maximum number of semaphores on system
 Collecting - Maximum number of semaphores per semaphore set
 Collecting - Memory Information
 Collecting - OS Packages
 Collecting - Operating system release information and kernel version
 Collecting - Patches for RDBMS Home
 Collecting - Patches xml for RDBMS Home
 Collecting - RDBMS patch inventory
 Collecting - Table of file system defaults
 Collecting - number of semaphore operations per semop system call
 Collecting - Database Server Infrastructure Software and Configuration
 Collecting - Disk Information
 Collecting - Root user limits
 Collecting - Verify ORAchk scheduler configuration
 Collecting - Verify TCP Selective Acknowledgement is enabled
 Collecting - Verify no database server kernel out of memory errors
 Collecting - Verify the vm.min_free_kbytes configuration

Data collections completed. Checking best practices on vbgeneric.

 INFO =>     Traditional auditing is enabled in database for orcl12c
 WARNING =>  Linux swap configuration does not meet recommendation
 WARNING =>  Hidden database initialization parameters should not be set per best practice recommendations for orcl12c
 FAIL =>     loopback interface MTU value needs to be set to 16436
 INFO =>     Most recent ADR incidents for /u01/app/oracle/product/12.2/db_1
 FAIL =>     Verify Database Memory Allocation
 INFO =>     Oracle GoldenGate failure prevention best practices
 FAIL =>     The vm.min_free_kbytes configuration is not set as recommended
 INFO =>     user_dump_dest has trace files older than 30 days for orcl12c
 INFO =>     At some times checkpoints are not being completed for orcl12c
 WARNING =>  One or more redo log groups are not multiplexed for orcl12c
 WARNING =>  Primary database is not protected with Data Guard (standby database) for real-time data protection and availability for orcl12c
 INFO =>     Important Storage Minimum Requirements for Grid & Database Homes
 CRITICAL => Operating system hugepages count does not satisfy total SGA requirements
 FAIL =>     Table AUD$[FGA_LOG$] should use Automatic Segment Space Management for orcl12c
 FAIL =>     Database parameter DB_LOST_WRITE_PROTECT is not set to recommended value on orcl12c instance
 INFO =>     umask for RDBMS owner is not set to 0022
 FAIL =>     Database parameter DB_BLOCK_CHECKING on primary is not set to the recommended value. for orcl12c
 INFO =>     Operational Best Practices
 INFO =>     Database Consolidation Best Practices
 INFO =>     Computer failure prevention best practices
 INFO =>     Data corruption prevention best practices
 INFO =>     Logical corruption prevention best practices
 INFO =>     Database/Cluster/Site failure prevention best practices
 INFO =>     Client failover operational best practices
 WARNING =>  Oracle patch 30712670 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 29867728 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 31142749 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 26749785 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 29302565 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 29259068 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle clusterware is not being used
 WARNING =>  RAC Application Cluster is not being used for database high availability on orcl12c instance
 WARNING =>  DISK_ASYNCH_IO is NOT set to recommended value for orcl12c
 WARNING =>  Flashback on PRIMARY is not configured for orcl12c
 INFO =>     Database failure prevention best practices
 WARNING =>  fast_start_mttr_target has NOT been changed from default on orcl12c instance
 FAIL =>     Active Data Guard is not configured for orcl12c
 WARNING =>  Perl Patch 31858212 is not found in RDBMS_HOME. /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 31602782 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 33121934 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 WARNING =>  Oracle patch 31211220 is not applied on RDBMS_HOME /u01/app/oracle/product/12.2/db_1
 INFO =>     Software maintenance best practices
 INFO =>     Oracle recovery manager(rman) best practices
 INFO =>     Database feature usage statistics for orcl12c
 WARNING =>  Consider investigating changes to the schema objects such as DDLs or new object creation for orcl12c
 WARNING =>  Consider investigating the frequency of SGA resize operations and take corrective action for orcl12c

UPLOAD [if required] - /opt/oracle.ahf/data/vbgeneric/orachk/user_root/output/orachk_vbgeneric_orcl12c_022724_140315.zip

[root@vbgeneric AHF_Installer]# 

Thus, you can actually run the 24.1 orachk against even a 12.2 non-RAC (single instance) database.

The complete report is in HTML format in the final ZIP file.  

Here's the header :

Categories: DBA Blogs

Oracle Vector Datatype – Updating table data

DBASolved - Mon, 2024-02-26 14:36

  In my last blog post on Oracle’s Vector data type, I simply showed you how the datatype is used […]

The post Oracle Vector Datatype – Updating table data appeared first on DBASolved.

Categories: DBA Blogs

Different lists of dependencies

Tom Kyte - Mon, 2024-02-26 13:46
As part of a migration effort, I'm researching dependencies and am confused by the different results displayed by SQL Developer's Dependencies tab versus running something like the following:

SELECT * FROM user_dependencies WHERE name = 'USP_COMPANYIMPORT';

The former displays 19 rows, whereas the latter displays only 15 rows, including two where the REFERENCED_OWNER is SYS. Q1: Why the difference? Q2: Is it possible to view the code SQL Developer runs to obtain its results? Thank you.
Categories: DBA Blogs


Tom Kyte - Mon, 2024-02-26 13:46
Is Oracle working on the Oracle Database PL/SQL package UTL_HTTP to add support for http_versions: HTTP/2 and HTTP/3?
Categories: DBA Blogs

Dropping and purging table does not release space back to the tablespace

Tom Kyte - Mon, 2024-02-26 13:46
Dear Tom, Oracle 4-node RAC, version 19c. In my tablespace I have a total of 570 partitioned tables that have zero rows. The initial extent is 8M for each partition, so collectively the empty tables occupy 2286.03 GB. As they are not needed, I have started to drop them. After dropping some 300 tables, I wanted to check the space released, but this query shows that the occupied space has not been released. I always thought that if I drop a table with purge, the space would immediately be released back to the tablespace. What am I doing wrong?

select round(sum (bytes/1024/1024/1024),2) GB from dba_segments Where tablespace_name='TOPREP_DAT' and owner ='SAMSUNGLTE';

GB
---
2286.03
Categories: DBA Blogs



Object Dependency with RPC-Signature Dependency Mode

Tom Kyte - Mon, 2024-02-26 13:46
Dear AskTom team, I am happy that you are available for questions again :-) I was studying the 'Database Development Guide - 26 Understanding Schema Object Dependency' and focused on the topic '26.10.2 RPC-Signature Dependency Mode'. It says: 'Changing the data type of a parameter to another data type in the same class does not change the RPC signature, but changing the data type to a data type in another class does.' After studying this I tried it out on LiveSQL. Sadly, the dependent object always becomes invalid after I change the parameter of the referenced object to another data type in the same class (e.g. from 'number' to 'integer'); refer to my LiveSQL link. I tried to understand it but could not. What am I doing wrong here? Or did I misunderstand the documentation? Thanks for your support! Greetings, Walter
Categories: DBA Blogs

UTL_FILE.FGETATTR can not find an existing file

Tom Kyte - Mon, 2024-02-26 13:46
I created a text file on the Oracle database server. The name of the file is 'TestFile' and it is located in C:\TestFolder\TestFile.txt. The whole C drive, the 'TestFolder' folder and the 'TestFile.txt' file have full control permission for all OS users. I created a directory:

Create directory CheckFileExist as 'C:\TestFolder';

I granted read and write permissions on the CheckFileExist directory to the SYS Oracle user:

Grant read, write on directory CheckFileExist to SYS;

I wrote a block to check whether Oracle can find the 'TestFile.txt' file or not:

Declare
  V_File_Exists Boolean;
  V_File_Length Number;
  V_File_Size Number;
  Directory_Name Nvarchar2(255):='CheckFileExist';
Begin
  UTL_FILE.FGETATTR (Directory_Name, 'TestFile', V_File_Exists, V_File_Length, V_File_Size);
  If V_File_Exists Then
    DBMS_OUTPUT.PUT_LINE('File exists');
  Else
    DBMS_OUTPUT.PUT_LINE('File does not exist');
  End if;
End;

When I execute it, the result is that the file does not exist. What is the problem?
Categories: DBA Blogs

SQL Server: Manage large data ranges using partitioning

Yann Neuhaus - Mon, 2024-02-26 10:59

When it comes to moving ranges of data with many rows across different tables in SQL-Server, the partitioning functionality of SQL-Server can provide a good solution for manageability and performance optimization. In this blog we will look at the advantages and the concept of partitioning in SQL-Server.

Concept overview: 

In SQL-Server, partitioning can divide the data of an index or table into smaller units. These units are called partitions. For that purpose, every row is assigned to a range and every range in turn is assigned to a specific partition. Practically there are two main components: The partition function and the partition scheme.  

The partition function defines the range borders through boundary values and thereby also the number of partitions (one more than the number of boundary values). You can define a partition function either as “range right” or “range left”. The main difference is how the boundary value is treated. In a range right partition function, the boundary value is the first value of the next partition, while in a range left partition function the boundary value is the last value of the previous partition. For example: 

We want to partition a table by year, and the column to which we want to apply the partition function has the datatype “date”. In total we have entries for the years 2023 and 2024, which means we want two partitions. In a range right function, the boundary value must be the first day of the year 2024, whereas in a range left function the boundary value must be the last day of the year 2023.  

See example below: 

 Partition right  Partition left
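
The boundary handling described above can be checked with the $PARTITION function, which returns the partition number a given value would fall into. The following is a minimal sketch for the 2023/2024 example; the function names pfDemoRight and pfDemoLeft are hypothetical and not part of the demo lab below.

```sql
-- Hypothetical demo functions (names are assumptions, not from the original lab).
-- RANGE RIGHT: the boundary value is the first value of the next partition.
CREATE PARTITION FUNCTION pfDemoRight (date)
AS RANGE RIGHT FOR VALUES ('2024-01-01');
GO
-- RANGE LEFT: the boundary value is the last value of the previous partition.
CREATE PARTITION FUNCTION pfDemoLeft (date)
AS RANGE LEFT FOR VALUES ('2023-12-31');
GO
-- Both functions place all of 2023 in partition 1 and all of 2024 in partition 2.
SELECT $PARTITION.pfDemoRight(CAST('2023-12-31' AS date)) AS Right_LastDay2023,   -- 1
       $PARTITION.pfDemoRight(CAST('2024-01-01' AS date)) AS Right_FirstDay2024,  -- 2
       $PARTITION.pfDemoLeft(CAST('2023-12-31' AS date))  AS Left_LastDay2023,    -- 1
       $PARTITION.pfDemoLeft(CAST('2024-01-01' AS date))  AS Left_FirstDay2024;   -- 2
GO
-- Clean up the demo objects.
DROP PARTITION FUNCTION pfDemoRight;
DROP PARTITION FUNCTION pfDemoLeft;
GO
```

With a datetime column instead of date, a range left boundary would have to be the last representable instant of 2023 (2023-12-31 23:59:59.997), which is one reason range right is often the more convenient choice for time-based partitioning.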

The partition scheme is used to map the partitions, which are defined through the partition function, to one or more filegroups.

Main benefits of partitioning:

There are multiple scenarios where the performance or manageability of a data model can be increased through partitioning. The main advantage of partitioning is that it reduces contention on the whole table as a database object and restricts it to the partition level when performing operations on the corresponding data range. Partitioning also facilitates data transfer with the “switch partition” statement, which performs a switch-in or switch-out of a whole partition. Through that, a large amount of data can be transferred very quickly.

Demo Lab:

For demo purposes I created the following script, which will create three tables with 5 million rows of historical data from 11 years in the past until today:

USE [master] 
GO 
--Create Test Database 
CREATE DATABASE [TestPartition] 
GO 
--Change Recovery Model (the statement is missing in this copy; SIMPLE is an assumption) 
ALTER DATABASE [TestPartition] SET RECOVERY SIMPLE 
GO 
--Create Tables 
Use [TestPartition] 
GO 
CREATE TABLE [dbo].[Table01_HEAP]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
CREATE TABLE [dbo].[Table01_CLUSTEREDINDEX]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
CREATE TABLE [dbo].[Table01_PARTITIONED]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
declare @date as datetime 
declare @YearSubtract int 
declare @DaySubtract int 
declare @HourSubtract int 
declare @MinuteSubtract int  
declare @SecondSubtract int  
declare @MilliSubtract int 
--Specify how many years backwards data should be generated 
declare @YearsBackward int 
set @YearsBackward = 11 
--Specify how many rows of data should be generated 
declare @rows2generate int 
set @rows2generate = 5000000 
declare @counter int 
set @counter = 1 
--generate data entries 
while @counter <= @rows2generate  
BEGIN 
--pick a random offset for every date part 
Set @YearSubtract = floor(rand() * (@YearsBackward - 0 + 1)) + 0 
Set @DaySubtract = floor(rand() * (365 - 0 + 1)) + 0 
Set @HourSubtract = floor(rand() * (24 - 0 + 1)) + 0 
Set @MinuteSubtract = floor(rand() * (60 - 0 + 1)) + 0 
Set @SecondSubtract = floor(rand() * (60 - 0 + 1)) + 0 
Set @MilliSubtract = floor(rand() * (1000 - 0 + 1)) + 0 
--subtract the offsets from the current date 
set @date = Dateadd(YEAR, -@YearSubtract , Getdate()) 
set @date = Dateadd(DAY, -@DaySubtract , @date) 
set @date = Dateadd(HOUR, -@HourSubtract , @date) 
set @date = Dateadd(MINUTE, -@MinuteSubtract , @date) 
set @date = Dateadd(SECOND, -@SecondSubtract , @date) 
set @date = Dateadd(MILLISECOND, -@MilliSubtract , @date) 
insert into Table01_HEAP (Entry_Datetime, Entry_Text) 
Values (@date, 'This is a entry from ' + convert(nvarchar, @date, 29)) 
set @counter = @counter + 1 
END 
GO 
--Copy the generated rows into the other two tables 
INSERT INTO Table01_CLUSTEREDINDEX 
  (Entry_Datetime, Entry_Text) 
SELECT Entry_Datetime, Entry_Text 
  FROM Table01_HEAP 
GO 
INSERT INTO Table01_PARTITIONED 
  (Entry_Datetime, Entry_Text) 
SELECT Entry_Datetime, Entry_Text 
  FROM Table01_HEAP 
GO 
--Create Clustered Indexes for dbo.Table01_CLUSTEREDINDEX and dbo.Table01_PARTITIONED 
--(the first index name is inferred from the naming pattern used elsewhere in this post) 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_CLUSTEREDINDEX] ON [dbo].[Table01_CLUSTEREDINDEX] 
( 
[Entry_Datetime] ASC 
) on [PRIMARY] 
GO 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_PARTITIONED] ON [dbo].[Table01_PARTITIONED] 
( 
[Entry_Datetime] ASC 
) on [PRIMARY] 
GO 

All three tables contain the same data. The difference between them is that one is a heap, one has a clustered index, and one has a clustered index which will be partitioned in the next step:

 Generated data

After the tables are created with the corresponding data and indexes, the partition function and scheme must be created. This was done by the following script:

-- Create Partition Function as range right for every Year -10 Years 
Create Partition Function [myPF01_datetime] (datetime) 
AS Range Right for Values ( 
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) + 0, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 1, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 2, 0), 
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 3, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 4, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 5, 0),  
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 6, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 7, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 8, 0),  
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 9, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 10, 0) 
) 
GO 
-- Create Partition Scheme for Partition Function myPF01_datetime 
-- (every partition is mapped to the primary filegroup) 
Create Partition Scheme [myPS01_datetime] 
AS Partition [myPF01_datetime] ALL TO ([PRIMARY]) 
GO 

I have used the DATEADD() function in combination with the DATEDIFF() function to retrieve the first millisecond of the year as a datetime value, for the current year and each of the last 10 years, and used these values as range right boundary values. Of course it is also possible to hard code the boundary values, like ‘2014-01-01 00:00:00.000’, but I prefer to keep it as dynamic as possible. In the end the result is the same:

 Select Dateadd - function
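
As a small sketch of how this idiom behaves: DATEDIFF(yy, 0, GETDATE()) counts the year boundaries between the base date 0 (1900-01-01) and today, and adding that many years back onto the base date yields midnight on January 1 of the current year. The derived table of offsets and its alias o (YearsBack) are only illustrative.

```sql
-- Evaluate the boundary expression for a few sample offsets.
-- YearsBack = 0 gives January 1 of the current year,
-- YearsBack = 10 gives January 1 ten years earlier.
SELECT o.YearsBack,
       DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - o.YearsBack, 0) AS BoundaryValue
FROM (VALUES (0), (1), (2), (10)) AS o (YearsBack)
ORDER BY o.YearsBack;
```

Because the expression depends on GETDATE(), the boundaries are fixed at the moment the partition function is created; they do not move forward by themselves in later years.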

After creating the partition function, I have created the partition scheme. The partition scheme is mapped to the partition function. In my case I assign every partition to the primary filegroup. It is also possible to split the partitions across multiple filegroups.

Once the partition function and scheme have been created successfully, they can be applied to the existing table Table01_PARTITIONED. To achieve that, the clustered index of the table must be recreated on the partition scheme instead of the primary filegroup: 

-- Apply partitioning on Table: Table01_PARTITIONED through recreating the Tables Clustered Index ClusteredIndex_Table01_PARTITIONED on Partition Scheme myPS01_datetime 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_PARTITIONED] ON [dbo].[Table01_PARTITIONED] 
( 
[Entry_Datetime] ASC 
) with (DROP_EXISTING = ON) on myPS01_datetime(Entry_Datetime);  

After doing that, the table Table01_PARTITIONED has multiple partitions while the other tables still have only one partition: 

 Partitions of partitioned table  Partitions of clustered index table
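
One way to list the partitions and their row counts is the sys.partitions catalog view. This is a sketch against the demo tables created above and may differ from the query used for the screenshots.

```sql
USE [TestPartition];
GO
-- One row per partition of the clustered index (index_id = 1) of each table.
SELECT OBJECT_NAME(p.object_id) AS TableName,
       p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id IN (OBJECT_ID(N'dbo.Table01_PARTITIONED'),
                      OBJECT_ID(N'dbo.Table01_CLUSTEREDINDEX'))
  AND p.index_id = 1
ORDER BY TableName, p.partition_number;
```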

In total there are 12 partitions: one for each year from 2014 to 2023, one for every entry with a datetime earlier than 2014-01-01 00:00:00.000, and one for every entry with a datetime of 2024-01-01 00:00:00.000 or later. Partition nr. 1 holds the earliest data and partition nr. 12 holds the latest data. See below: 

 Content of partition 1  Content of partition 12

DEMO Tests:

First, I want to compare the performance when moving outdated data, which is older than 2014-01-01 00:00:00.000, from the table itself to a history table. For that purpose, I created a history table with the same data structure as the table Table01_CLUSTEREDINDEX:

Use [TestPartition] 
GO 
--Create History Table 
CREATE TABLE [dbo].[Table01_HISTORY01]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
--Create Clustered Indexes for dbo.Table01_HISTORY01 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_HISTORY01] ON [dbo].[Table01_HISTORY01] 
( 
[Entry_Datetime] ASC 
) on [PRIMARY] 
GO 

I start with the table that has the clustered index, using a classic “insert into select” statement:

 Select insert data into history

We can see that we have 10932 reads in total and a total query run time of 761 milliseconds.

 Execution plan select insert

In the execution plan, we can see that a classic index seek operation occurred. This means the database engine sought every row with a datetime value earlier than 2014-01-01 00:00:00.000 and wrote it into the history table.

For the delete operation we can see similar results:

 delete rows  Delete rows

In total, 785099 rows were moved, and the table Table01_CLUSTEREDINDEX no longer contains entries older than 2014-01-01 00:00:00.000:

 Verify table content

Next let us compare the data movement when using a “switch partition” statement. For switching a partition from a partitioned source table to a nonpartitioned destination table, we need to use the partition number of the source table. For that I run the following query:

 Switch partition

We can see that partition number 1 was moved within 2 milliseconds. Compared to the previous approach, which took 761 milliseconds to insert the data and an additional 596 milliseconds to delete it, the switch partition operation is obviously much faster. But why is this the case? Because switching partitions is a metadata operation. It does not seek through an index (or, even worse, scan a table) and write every row one by one; instead, it changes the metadata of the partition and remaps the partition to the target table.
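The difference between the two approaches can be illustrated with a toy model. This is only a conceptual sketch in Python, not how SQL Server is implemented: tables are modelled as dictionaries from partition number to a list of rows, and all names (`move_row_by_row`, `switch_partition`) are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (Entry_Datetime, Entry_Text), simplified to strings


def move_row_by_row(source: List[Row], target: List[Row]) -> int:
    """Model INSERT ... SELECT plus DELETE: every row is touched once."""
    touched = 0
    while source:
        target.append(source.pop())
        touched += 1
    return touched


def switch_partition(source: Dict[int, List[Row]], number: int,
                     target: Dict[int, List[Row]]) -> None:
    """Model a partition switch: only the owner of the row container changes."""
    if target[1]:
        raise ValueError("target must be empty before a switch")
    target[1] = source[number]  # re-point the metadata, no row is copied
    source[number] = []         # the source partition is now empty


if __name__ == "__main__":
    rows = [("2013-05-01", "a"), ("2013-06-01", "b")]
    partitioned = {1: rows, 2: [("2014-02-01", "c")]}
    history = {1: []}
    switch_partition(partitioned, 1, history)
    print(history[1] is rows)  # True: same object, nothing was copied
```

In this model, the cost of the row-by-row move grows with the number of rows, while the switch costs the same regardless of how many rows the partition holds.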

And as we can see, we have the same result:

 verify table content

Another big advantage is when it comes to deleting a whole data range. For example: Let us delete the entries of the year 2017 – we do not need them anymore.

For the table with the clustered Index, we must use a statement like this:

 delete operation

We can see that we have here a query runtime of 355 milliseconds and 68351 page reads in total for the delete operation with the clustered index. 

For the partitioned table, we can instead use a truncate operation on the specific partition. That’s because the partition is treated as its own physical unit and can therefore be truncated.

And as we should know, truncating is much faster: the operation deallocates the pages and logs only the page deallocations in the transaction log, while a delete operation goes row by row and logs every row deletion.

So, let us try. The year 2017 is 7 years back, so let us first verify that the right data range will be deleted:

 verify partition content

The short query above shows that 7 years back corresponds to partition nr. 5, and the data range looks right. So, let us truncate:

 Truncate partition

And we can see that truncating all the entries from the year 2017 took the database engine 1 millisecond, compared to 355 milliseconds for the delete operation. Again, much faster.

Next, let’s see how we can change the locking behavior of SQL Server through partitioning. For that, I ran the following update query, which updates the entry text for every date later than May 2018:

 Update data entries

While the update operation above was running, I queried the DMV sys.dm_tran_locks in another session for checking the locks my update operation above is holding: 

 lock contention

We can see that we have a lot of page locks and also an exclusive lock on the object itself (in this case the Table01_HEAP). That is because of SQL Server’s lock escalation behavior.

I ran the same update operation on the partitioned table, but first I changed the lock escalation setting of the table from the default value “table” to “auto”. This is necessary to enable locking at partition level: 

 Update lock escalation

When I query the DMV again while the update operation above is running, I get the following result:

 lock contention

We can see that we no longer have an exclusive lock at object level. Instead, we have an intent exclusive lock, which does not prevent other transactions from accessing the data (as long as there is no other lock at a more granular level). We also have multiple exclusive locks on multiple resources of type HOBT. When we take the “resource_associated_entity_id” values and use them to query the sys.partitions table, we get the following information: 

 locked partitions

The resources locked by the update operation on the partitioned table are the partitions associated with the table. So, SQL Server locked the partitions instead of locking the whole table. This has the advantage that locking happens at a more granular level, which prevents lock contention on the table itself. 
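The effect of the lock escalation setting can be summarised in a small sketch. It is a simplified model of the documented LOCK_ESCALATION options (TABLE, AUTO, DISABLE), not SQL Server code; the function name `escalation_target` and the returned strings are hypothetical and only echo the resource types seen in sys.dm_tran_locks.

```python
def escalation_target(lock_escalation: str, is_partitioned: bool) -> str:
    """Return the resource type a lock escalation aims for in this simplified model."""
    setting = lock_escalation.upper()
    if setting == "TABLE":
        # Default setting: escalation always goes to the whole table.
        return "OBJECT"
    if setting == "AUTO":
        # Partitioned tables may escalate to the partition (HOBT) level;
        # non-partitioned tables still escalate to the table.
        return "HOBT" if is_partitioned else "OBJECT"
    if setting == "DISABLE":
        # Escalation is prevented in most cases.
        return "NONE"
    raise ValueError(f"unknown LOCK_ESCALATION setting: {lock_escalation}")


if __name__ == "__main__":
    print(escalation_target("TABLE", is_partitioned=True))
    print(escalation_target("AUTO", is_partitioned=True))
```

The sketch also shows why changing the setting alone does not help on a table that is not partitioned: with “auto”, such a table still escalates to the object level.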


Partitioning can be a very powerful and useful feature in SQL Server when used in an appropriate situation. Especially when it comes to regular operations on whole data ranges, partitioning can be used to enhance performance and manageability. With partitioning, it’s also possible to distribute the data of a table over multiple filegroups. Additionally, by splitting and merging partitions, it’s possible to maintain partitions for growing or shrinking data.

The article SQL Server: Manage large data ranges using partitioning first appeared on dbi Blog.

Kubernetes Networking by Using Cilium – Intermediate Level – Traditional Linux Routing

Yann Neuhaus - Mon, 2024-02-26 02:34

Welcome back to this blog post series about Kubernetes Networking by using Cilium. In the previous post about network interfaces, we looked at how to identify all the interfaces involved in the routing between pods. I’ve also explained the routing in a Kubernetes cluster with Cilium in non-technical language in this blog post. Let’s now see it in action for the techies!

Below is the drawing of where we left off:

We will continue to use the same method: you are the packet that will travel from your apartment (pod) on the top left to other pods in this Kubernetes cluster. But first, let’s take this opportunity to talk about namespaces and enrich our drawing with a new analogy.

Routing between namespaces

A namespace is a logical group of objects that provides isolation in the cluster. However, by default, all pods can communicate with each other in a “vanilla” Kubernetes, whether they belong to the same namespace or not. So the isolation provided by a namespace doesn’t mean the pods can’t communicate with each other. To allow or deny such communication, you need to create network policies. That could be the topic for another blog post!

We can use the analogy of a namespace being the same floor number in all buildings of our cluster. All apartments on the same floor in each building are logically grouped into the same namespace. This is what we can see below in our namespace called networking101:

$ kubectl get po -n networking101 -owide
NAME                        READY   STATUS    RESTARTS       AGE    IP            NODE                NOMINATED NODE   READINESS GATES
busybox-c8bbbbb84-fmhwc     1/1     Running   1 (125m ago)   4d1h   mycluster-worker2   <none>           <none>
busybox-c8bbbbb84-t6ggh     1/1     Running   1 (125m ago)   4d1h   mycluster-worker    <none>           <none>
netshoot-7d996d7884-fwt8z   1/1     Running   0              103m   mycluster-worker    <none>           <none>
netshoot-7d996d7884-gcxrm   1/1     Running   0              103m   mycluster-worker2   <none>           <none>

That’s our 4 apartments / pods on the same floor, grouped together in one namespace:

The routing process doesn’t care about the pod’s namespace; only its destination IP Address is used. Let’s now see how we can go from one apartment to another apartment in the same building (node).

Pod to pod routing on the same node

From your pod, you’ve decided to pay a visit to another pod on the same node. You first look at the routing table in order to know how to reach this destination. But you can’t go out if you don’t also have the MAC Address of your destination. You need both pieces of destination information (IP Address and MAC Address) before you can start to travel. You then look at the ARP table to find this information. The ARP table contains the known mappings of IP Addresses to MAC Addresses in your IP subnet. If the entry is not there, you first send a scout to knock at the door of each apartment in your community until you find the MAC Address of your destination. This is called the ARP request. When the scout comes back with that information, you write it into the ARP table. You thank the scout for his help and are now ready to start your travel by exiting the pod.
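The lookup logic described above (check the ARP table first, send a scout only on a miss, then remember the answer) can be sketched as a small cache. This is a toy model, not the kernel’s implementation: the class `ArpTable`, the `NEIGHBOURS` dictionary and the IP address 10.0.0.1 are made up for illustration, while the MAC address is the one that appears in the arping output of this post.

```python
from typing import Callable, Dict, Optional


class ArpTable:
    """Minimal model of a per-pod ARP cache: IP address -> MAC address."""

    def __init__(self, send_request: Callable[[str], Optional[str]]) -> None:
        self._entries: Dict[str, str] = {}
        self._send_request = send_request  # the "scout"
        self.requests_sent = 0

    def resolve(self, ip: str) -> str:
        """Return the MAC address for ip, sending an ARP request on a cache miss."""
        if ip in self._entries:
            return self._entries[ip]
        self.requests_sent += 1
        mac = self._send_request(ip)
        if mac is None:
            raise LookupError(f"no ARP reply for {ip}")
        self._entries[ip] = mac  # write the answer into the table
        return mac


# Hypothetical neighbour: the IP address is invented for this example.
NEIGHBOURS = {"10.0.0.1": "d6:21:74:eb:67:6b"}

if __name__ == "__main__":
    table = ArpTable(NEIGHBOURS.get)
    print(table.resolve("10.0.0.1"))  # first call sends a request
    print(table.resolve("10.0.0.1"))  # second call is answered from the table
    print(table.requests_sent)
```

The second lookup does not send another request, which mirrors the entry staying in the ARP table once the scout has reported back.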

Let’s see how we can trace this in our source pod:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- ip route
default via dev eth0 dev eth0 scope link

Very simple routing instruction! For every destination, you go out through your only network interface, eth0, in the pod. You can see from the drawing above that the default gateway is the IP Address of the cilium_host. You then check your ARP table:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arp -a

The arp -a command lists the content of the ARP table, and we can see there is nothing in there.

A way to send a scout out is by using the arping tool toward our destination. You may have noticed that for my pods I’m using busybox and netshoot images. Both provide networking tools that are useful for troubleshooting:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arping
ARPING from eth0
Unicast reply from [d6:21:74:eb:67:6b] 0.028ms
Unicast reply from [d6:21:74:eb:67:6b] 0.092ms
Unicast reply from [d6:21:74:eb:67:6b] 0.123ms
^CSent 3 probe(s) (1 broadcast(s))
Received 3 response(s) (0 request(s), 0 broadcast(s))

We now have the piece of information that was missing: the MAC address of our destination. We can then check that it has been written into the ARP table of our source pod:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arp -a
? ( at d6:21:74:eb:67:6b [ether]  on eth0

Here it is! However, you may wonder why we don’t see the IP Address of our destination here. In traditional networking that is what you would see, but here we are in a Kubernetes cluster where Cilium takes care of the networking. Also, we have seen above in the routing table of the source pod that for every destination we go to the cilium_host interface.

So the cilium_host on that node is attracting all the traffic even for communication between pods in the same IP subnet.

As a side note, below is a command where you can quickly display all the IP Addresses of the cilium_host and the nodes in your cluster in one shot:

$ kubectl get ciliumnodes
mycluster-control-plane   122d
mycluster-worker   122d
mycluster-worker2   122d

In traditional networking, doing L2 switching, the MAC Address of the destination is the one related to the destination IP Address. That is not the case here in Kubernetes networking. So which interface has the MAC Address d6:21:74:eb:67:6b? Let’s answer that question right away:

$ sudo docker exec -it mycluster-worker ip a | grep -iB1 d6:21:74:eb:67:6b
9: lxc4a891387ff1a@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d6:21:74:eb:67:6b brd ff:ff:ff:ff:ff:ff link-netns cni-67a5da05-a221-ade5-08dc-64808339ad05

That is the LXC interface of the node, which is indeed our next step from the source pod toward our destination. You’ve learned from my first blog post of this networking series that there is a servant waiting here at the LXC interface to direct us toward our destination.

From there, we don’t see much of the travel to the destination from the traditional Linux routing point of view. This is because the routing is done by the Cilium agent using eBPF. As the destination is in the same IP subnet as the source, the Cilium agent just switches the packet directly to the destination LXC interface, from which it reaches the destination pod.

When the destination pod responds to the source, the same process occurs and for the sake of completeness let’s look at the routing table and ARP table in the destination pod:

$ kubectl exec -it -n networking101 netshoot-7d996d7884-fwt8z -- ip route
default via dev eth0 mtu 1450 dev eth0 scope link

$ kubectl exec -it -n networking101 netshoot-7d996d7884-fwt8z -- arp -a
? ( at 92:65:df:09:dd:28 [ether]  on eth0

$ sudo docker exec -it mycluster-worker ip a | grep -iB1 92:65:df:09:dd:28
13: lxce84a702bb02c@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 92:65:df:09:dd:28 brd ff:ff:ff:ff:ff:ff link-netns cni-d259ef79-a81c-eba6-1255-6e46b8d1c779

So from the traditional Linux routing point of view, everything goes to the cilium_host and the destination MAC address is that of the LXC interface of the node that is linked to our pod. This is exactly what we have seen with our source pod.

Pod to pod routing on a different node

Let’s now have a look at how we can reach, from the source pod, a pod that is hosted on another node. The routing is the same at the beginning, but when talking to the servant at the LXC interface, he sees that the destination IP Address doesn’t belong to the same IP subnet and so directs us to the cilium_host in the Lobby. From there we are routed to the cilium_vxlan interface to reach the node that hosts our destination pod.
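The decision the servant makes (same IP subnet or not) can be sketched with Python’s standard ipaddress module. This is only an illustration of the subnet check, not Cilium’s eBPF logic; the function name `next_hop`, the pod CIDR 10.0.1.0/24 and the IP addresses are made up for the example.

```python
import ipaddress


def next_hop(destination_ip: str, local_pod_cidr: str) -> str:
    """Return which path a packet takes in this simplified model."""
    destination = ipaddress.ip_address(destination_ip)
    local_subnet = ipaddress.ip_network(local_pod_cidr)
    if destination in local_subnet:
        # Same node: delivered directly to the destination LXC interface.
        return "local LXC interface"
    # Other node: handed to cilium_host, then sent through the VXLAN tunnel.
    return "cilium_host -> cilium_vxlan"


if __name__ == "__main__":
    # Hypothetical pod subnet of the local node.
    local_cidr = "10.0.1.0/24"
    print(next_hop("10.0.1.20", local_cidr))
    print(next_hop("10.0.2.30", local_cidr))
```

The first destination falls inside the local subnet and stays on the node, while the second belongs to another subnet and takes the tunnel.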

Let’s now have a look at the routing table of the host:

$ sudo docker exec -it mycluster-worker ip route
default via dev eth0 via dev cilium_host proto kernel src mtu 1450 via dev cilium_host proto kernel src mtu 1450 via dev cilium_host proto kernel src dev cilium_host proto kernel scope link dev eth0 proto kernel scope link src

We don’t see much here as the routing is using eBPF and is managed by the Cilium agent as we’ve seen before.

As a side note, and to share everything with you, the output of the network interfaces as well as the ip route in the Cilium agent pod is identical to that of the node. This is because at startup the Cilium agent provides this information to the node. You can check the Cilium agent with the following commands:

$ kubectl exec -it -n kube-system cilium-dprvh -- ip a
$ kubectl exec -it -n kube-system cilium-dprvh -- ip route

So you go through the VXLAN tunnel and you reach the node mycluster-worker2. Here is the routing table of this node:

$ sudo docker exec -it mycluster-worker2 ip route
default via dev eth0 via dev cilium_host proto kernel src mtu 1450 via dev cilium_host proto kernel src dev cilium_host proto kernel scope link via dev cilium_host proto kernel src mtu 1450 dev eth0 proto kernel scope link src

Again, from the traditional Linux routing point of view there isn’t much to see, except that all the traffic for the pod subnets goes to the cilium_host, which is managed by the Cilium agent. This is identical to what we’ve learned on the other node. When we reach the cilium_vxlan interface, a servant is waiting for us with his magical eBPF map and directs us through a secret passage to the LXC corridor interface of the top left pod where we can reach our destination.

Wrap up

We’ve explored all that can be seen in routing from the traditional Linux point of view by using the common networking tools.

Maybe you feel frustrated at not understanding it completely because there are some gaps in this step-by-step packet routing? Cilium uses eBPF to route the packets, which adds some complexity to understanding the routing. However, it is much faster than traditional Linux routing thanks to the secret passages opened by the eBPF servants.

If you want to know more about this, don’t miss my next blog post where I’ll dive deep into the meanders of eBPF routing. See you there!

The article Kubernetes Networking by Using Cilium – Intermediate Level – Traditional Linux Routing first appeared on dbi Blog.

LLM Agents with Sparrow

Andrejus Baranovski - Mon, 2024-02-26 01:53
I explain new functionality in Sparrow - LLM agents support. This means you can implement independently running agents, and invoke them from CLI or API. This makes it easier to run various LLM related processing within Sparrow. 


